Difference between revisions of "Getting started with GitHub"

Since GitHub is used extensively outside the research community, there are many online resources on how to get started with GitHub. Some of those resources expect technical skills, but the list below links to resources that do not:
* https://guides.github.com/ - GitHub's own guide on how to get started
=== Specific sections in GitHub's guide we recommend to researchers learning to use GitHub ===
Some topics discussed in the GitHub guide are not relevant to research, but we recommend that researchers read the topics described in the following sections and use them frequently.
* [https://guides.github.com/features/issues/ issues]
* [https://guides.github.com/features/wikis/ documentation]


== Best  practices for managing a research project using GitHub ==

Revision as of 12:42, 17 May 2018

This page provides resources, and links to resources, on how to get started with GitHub. There are other Git hosting services besides GitHub, for example GitLab and Bitbucket, and most of these resources apply to those alternatives as well.

The World Bank's GitHub repositories can be found at www.github.com/worldbank.

Read First

What GitHub is good at and what it is less good at

Git was designed to manage code, and it does so by tracking changes to code in great detail. This is why Git is an excellent tool for collaborating on code, but the drawback is that Git is only efficient at tracking changes to raw text files. Code files in any programming language are raw text files, and so are .tex, .txt and .csv files. In contrast, .doc/.docx, .xls/.xlsx and .pdf files and images are examples of binary files. Binary files are stored very efficiently, but Git cannot read the text and numbers inside them and therefore cannot track changes to them in detail. Instead, Git in effect stores one full version of a binary file for each change made to it, which quickly becomes inefficient. See the sections on ignore files and combining GitHub and DropBox below for how to handle this.
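To make the distinction between raw text and binary files concrete, here is a minimal Python sketch of a common heuristic: a file is treated as binary if a NUL byte appears near its start. This is similar in spirit to the kind of check Git applies, but it is an illustration rather than Git's actual code. The function name `looks_binary` and the sample size are assumptions made for this example.

```python
def looks_binary(path, sample_size=8000):
    """Heuristic: treat a file as binary if its first bytes contain a NUL byte.

    `sample_size` is an assumed value for this sketch; only the start of
    the file is inspected, so the check stays fast for large files.
    """
    with open(path, "rb") as handle:
        sample = handle.read(sample_size)
    return b"\x00" in sample
```

Plain .csv or .tex files normally contain no NUL bytes, while .xlsx, .docx and most image formats typically do. The heuristic is not perfect: a text file saved in UTF-16, for example, contains NUL bytes and would be classified as binary.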

Resources for absolute beginners

Since GitHub is used extensively outside the research community, there are many online resources on how to get started with GitHub. Some of those resources expect technical skills, but the list below links to resources that do not:

* https://guides.github.com/ - GitHub's own guide on how to get started

Specific sections in GitHub's guide we recommend to researchers learning to use GitHub

Some topics discussed in the GitHub guide are not relevant to research, but we recommend that researchers read the following topics and use those features frequently:

* Issues: https://guides.github.com/features/issues/
* Documentation (wikis): https://guides.github.com/features/wikis/

Best practices for managing a research project using GitHub

Ignore files

Ignore files are a very important tool for controlling which parts of your data work folder you share in the cloud. This is a good way to
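As a rough illustration of what an ignore file does, the Python sketch below filters a list of files using a few wildcard patterns. The patterns, file names and the helper `is_ignored` are hypothetical and chosen only for this example. The matching is deliberately simplified: real Git ignore files support more features, such as directory patterns and negation.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Hypothetical patterns: keep data and binary outputs out of the repository.
IGNORE_PATTERNS = ["*.dta", "*.xlsx", "*.docx", "*.pdf"]


def is_ignored(path, patterns=IGNORE_PATTERNS):
    """Return True if the file name matches any ignore pattern."""
    name = PurePosixPath(path).name  # match on the file name only
    return any(fnmatchcase(name, pattern) for pattern in patterns)


# Hypothetical contents of a data work folder.
files = [
    "dofiles/cleaning.do",
    "data/survey.dta",
    "outputs/report.pdf",
    "README.md",
]

# Only files that match no pattern would be shared through Git.
tracked = [f for f in files if not is_ignored(f)]
print(tracked)  # ['dofiles/cleaning.do', 'README.md']
```

In this sketch the code file and the README would be shared, while the data set and the PDF report stay local.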

Combining GitHub and DropBox

In research we often want to use a syncing service such as DropBox or OneDrive in combination with GitHub. This requires a specific setup, because GitHub is also a syncing service, although it works very differently from DropBox, OneDrive and similar services.

Combining GitHub and DropBox is a great way to share data and binary files among team members without leaking private data to the GitHub cloud. It also works around the fact that GitHub tracks binary files in a way that is very inefficient in terms of disk space. See this guide for how to combine GitHub and DropBox. The guide includes some slightly more technical steps, but it solves a big issue and is easy to maintain once it is set up.

Back to Parent

This article is part of the topic Data Management