Managing development infrastructure in an academic lab involves practical tasks such as managing user accounts, file systems, and development servers. In addition, there is a need for a methodology to handle projects using version control repositories. We have to manage source code, data, and measurements. In a typical lab, all of these things are addressed in an ad-hoc fashion. Faculty don’t have the time (and often, the knowledge) to dive into the details of these tasks, and graduate students don’t stick around long enough to introduce a stable long-term policy.
Over the past two years, GitHub has taken over a large portion of the development infrastructure in my group. Traditionally we handled this in an ad-hoc way involving servers, Subversion, Apache, user accounts, disks, filesystems, and so on. But, thanks to the cloud and the folks at GitHub, that is now largely a thing of the past. (I’m neither a GitHub spokesperson nor a GitHub expert, and I pay for their service as much as every other academic: nothing.)
In this post, I’m relating my experience with using GitHub Classroom, a component in GitHub where the students in your class can manage their development using git. If you don’t know what git is, then you probably should skip this post and start with an introduction.
Several of the classes I’m teaching, including Microcontroller Programming and Hardware/Software Codesign, require the development of code. Usually the students don’t have to program or create more than 2 or 3 files, but to make these work, you need a larger environment. The idea of GitHub Classroom is that you can set up assignments as repositories, which can be private between the student and TA/faculty. Each assignment is copied from a reference repository, so that the students get starter code. As the students complete the assignment, they update their own repository with the solution. When the assignment deadline comes around, the most recent commit can be saved as the assignment submission.
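To make the deadline step concrete, here is a minimal sketch in Python of how one might pick the submission from a student's commit history. It is not how GitHub Classroom does it internally; the `Commit` record, the `pick_submission` function, and the sample data are all hypothetical names made up for illustration. The assumption is that the submission is the most recent commit that is not later than the deadline.

```python
from datetime import datetime, timezone
from typing import NamedTuple, Optional, Sequence


class Commit(NamedTuple):
    """Hypothetical record of one commit in a student's repository."""
    sha: str
    pushed_at: datetime


def pick_submission(commits: Sequence[Commit], deadline: datetime) -> Optional[Commit]:
    """Return the most recent commit made at or before the deadline.

    Returns None when the student has no commit before the deadline.
    """
    on_time = [c for c in commits if c.pushed_at <= deadline]
    return max(on_time, key=lambda c: c.pushed_at, default=None)


# Made-up example history: two commits before the deadline, one after it.
deadline = datetime(2024, 3, 1, 23, 59, tzinfo=timezone.utc)
history = [
    Commit("a1", datetime(2024, 2, 20, 10, 0, tzinfo=timezone.utc)),
    Commit("b2", datetime(2024, 3, 1, 23, 30, tzinfo=timezone.utc)),
    Commit("c3", datetime(2024, 3, 2, 8, 15, tzinfo=timezone.utc)),
]
submission = pick_submission(history, deadline)
print(submission.sha if submission else "no submission")
```

With this rule a late push does not overwrite the on-time submission, which is convenient when a student keeps working after the deadline.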
There are many reasons why I like this model, and below are a few of the obvious arguments.
Most of the Homework and Projects we do in our courses involve larger development trees, which can contain hundreds of files. The solutions for Homework and Projects are a heterogeneous set of file formats including source code, binary code, graphics, PDF, and text files. Therefore I prefer to provide the assignment as a full development tree, rather than an archive or a tarball. The latter is cumbersome, error-prone (I’m always forgetting to add something), unreliable, and it’s often incompatible between file systems. The git repository seems to be the right format to provide an assignment.
Students can return the answer to a Project or Homework as a new version of the repository, which seems to work quite well. They don’t forget to include any of the changes they made, since the version control system keeps track of them. And, for grading, it’s easier to get access to the students’ files. We’ve used (and are still using) many course management systems such as Blackboard, Canvas, or Sakai. None of these systems seems to work as well for Computer Engineering problems as a version control system.
Doing Homework or Projects as git repositories also enables me to see the design of the students as they make progress. I often get emails such as ‘Professor, I’ve been looking at this file for a very long time, could you please give me a suggestion?’. Since they work on a version control system, I can ask them to push a new version to their Homework repository, and I can directly inspect what they are working on. This saves a lot of time, since I don’t have to spend time to understand the development context from a small snippet of problematic code.
Of course, there is some extra overhead that comes with working with repositories, and I wouldn’t claim that it is useful for every programming course. First, the students have to learn git and the concept of distributed version control. That seems a reasonable requirement, as you would expect any computer engineer to be able to handle development under version control. Second, git is not a silver bullet. A version control system only makes the distribution and collection of assignments easier; a bad developer using git will still be a bad developer.
In the Hardware/Software Codesign course of this semester, we had 65 students and 9 different assignments: eight Homework and a Codesign Challenge. So by the end of the semester, we were managing 632 repositories, one for each student and for each assignment. The chart above illustrates the aggregate repository creation and update operations. The X-axis represents the semester, and the Y-axis shows the number of operations per day. The different colors belong to different assignments, and the graphs show, from left to right, Homework 2 through 8, and the final Codesign Challenge.
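For readers who want to build a similar chart, the counting behind it can be sketched in a few lines of Python. This is only an illustration and not the script I used: the event list is made-up sample data, and `daily_operations` and `peak_day` are hypothetical helper names. The assumption is that each repository creation or update is available as an (assignment, day) pair.

```python
from collections import Counter
from datetime import date
from typing import Iterable, Tuple

Event = Tuple[str, date]  # (assignment name, day of the create/update operation)


def daily_operations(events: Iterable[Event]) -> Counter:
    """Count operations per (assignment, day) pair."""
    counts: Counter = Counter()
    for assignment, day in events:
        counts[(assignment, day)] += 1
    return counts


def peak_day(counts: Counter, assignment: str) -> date:
    """Return the day with the most operations for one assignment."""
    per_day = {day: n for (name, day), n in counts.items() if name == assignment}
    return max(per_day, key=per_day.get)


# Made-up sample: most activity for "hw5" lands on the last day.
events = [
    ("hw5", date(2024, 3, 10)),
    ("hw5", date(2024, 3, 14)),
    ("hw5", date(2024, 3, 14)),
    ("hw5", date(2024, 3, 14)),
    ("hw6", date(2024, 3, 20)),
]
counts = daily_operations(events)
print(peak_day(counts, "hw5"))
```

Plotting one curve per assignment from `counts` gives the per-day view described above.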
There are some obvious observations. First, students like to procrastinate. The peaks in the operations always come right before the deadline. I tell the students that working at a steady pace is less tiring and more valuable in the long term, but so far I haven’t found a proper incentive to make them actually do it. It’s just human nature. Second, using git requires some additional discipline to systematically push new developments into the repository. Students don’t work that way: they clone a repository on day one, they work out the solution, and they post the answer using a single commit. Most of the time, at least.
Sometimes, the assignments are too hard, and when the deadline gets closer, the students push their partial solution into the repository. An interesting example of that is the yellow curve in the middle of the graph (Homework 5). That Homework was, even if I say so myself, a very challenging design problem, and we ended up pushing back the deadline by a week. The first yellow peak is before the deadline extension. The second yellow peak is after we spent a lecture in class discussing solution strategies.
Overall, I found the concept of combining version control with Homework/Project assignments for Computer Engineering problems to be very effective, scalable, and flexible. For sure I’m planning to use this again in the future.