Difference between revisions of "Naming Conventions"

<onlyinclude>Impact evaluation projects should follow a clear file naming convention, since many team members will need to understand and work with the files over the project's lifetime. Use a naming convention that is clear not only to you but also to someone opening the files years later.</onlyinclude>


== Read First ==
*Follow clear and consistent naming conventions for all files
*Pay special attention to naming conventions for code-compatible files
*Use version control software (such as Git/GitHub) instead of naming folders _v01, _v02, _old, _new, etc.
== Naming requirements for code files ==
Files accessed by code have special naming requirements, since different software and operating systems read file names in different ways.
=== Spaces ===
Spaces in a file name, or anywhere in its folder path, can break the path when it is read by code. So while a Word document may be called ''2019-10-30 Sampling Procedure Description.docx'', a related code file should have a name like ''sampling-endline.do''.
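
As a rough illustration of the point about spaces, the following Python sketch lists every folder or file name in a path that contains a space. The function name <code>parts_with_spaces</code> and the example folder names are hypothetical, chosen only for this example.

```python
from pathlib import PurePosixPath


def parts_with_spaces(path: str) -> list[str]:
    """Return every folder or file name in `path` that contains a space."""
    return [part for part in PurePosixPath(path).parts if " " in part]


# Hypothetical project paths, used only for illustration.
print(parts_with_spaces("project/Data Work/sampling-endline.do"))  # ['Data Work']
print(parts_with_spaces("project/datawork/sampling-endline.do"))   # []
```

A check like this could be run over a project folder before sharing code, so that problematic paths are caught early.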


=== Timestamps ===
Adding timestamps to binary files, as in the example above, can be useful, since changes to binary files are not straightforward to track using version control software.
For plaintext files version-controlled using Git, timestamps are an unnecessary distraction.
Output tables, graphs, and documentation are an exception, and it is good practice to date them. For clarity, date them instead of numbering versions: for example, "_2017June8" rather than "_v02".
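
One possible way to build such a dated name is sketched below in Python. The helper <code>dated_name</code> and the file stem <code>balance-table</code> are assumptions made for this example. Month names are spelled out in a list so the result does not depend on the computer's locale settings.

```python
from datetime import date

# Month names are hard-coded so the output is the same on every machine.
MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]


def dated_name(stem: str, extension: str, day: date) -> str:
    """Append a date suffix such as _2017June8 to a file stem."""
    return f"{stem}_{day.year}{MONTHS[day.month - 1]}{day.day}{extension}"


# Hypothetical output table, dated 8 June 2017.
print(dated_name("balance-table", ".tex", date(2017, 6, 8)))  # balance-table_2017June8.tex
```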


=== Capitalization ===
Code-compatible files should never include capital letters, as strings and file paths are case-sensitive in some software.
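
The short Python sketch below illustrates why capital letters cause trouble: in a case-sensitive comparison, two names that differ only in capitalization are different strings. The helper <code>has_capital_letters</code> and the file names are hypothetical.

```python
def has_capital_letters(name: str) -> bool:
    """Return True if the file name contains at least one capital letter."""
    return any(character.isupper() for character in name)


# In a case-sensitive comparison these two names do not match.
print("Baseline-Household.dta" == "baseline-household.dta")  # False

print(has_capital_letters("Baseline-Household.dta"))  # True
print(has_capital_letters("baseline-household.dta"))  # False
```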


=== Sortability ===
Use names that can be sorted easily, for example alphabetically. The best word order from a coding perspective is usually the reverse of the natural English order: start with the most general information and end with the most specific. For example, for a deidentified household dataset from the baseline round, prefer a name like ''baseline-household-deidentified.dta'' over the order used in natural language. This keeps all baseline data together, then all baseline-household data, and leaves the information unique to this specific file for last.
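
The effect of this ordering can be seen by sorting a handful of names, as in the Python sketch below. Apart from ''baseline-household-deidentified.dta'', the file names are made up for this example.

```python
# Hypothetical dataset names that follow the general-to-specific pattern.
names = [
    "endline-household-deidentified.dta",
    "baseline-village-raw.dta",
    "baseline-household-raw.dta",
    "endline-village-raw.dta",
    "baseline-household-deidentified.dta",
]

# Alphabetical sorting groups files by round first, then by unit.
ordered = sorted(names)
for name in ordered:
    print(name)
```

After sorting, the three baseline files appear first, with the two baseline-household files next to each other, followed by the endline files.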


== Version Control ==
 
It is generally recommended to use version control software such as Git/GitHub instead of naming folders _v01, _v02, _old, _new, _final, etc. The only exceptions are tables, graphs, and documentation, which it is good practice to date. Such files should be dated, i.e. ''filename''_2017June08 rather than ''filename''_v02.
 
=== Using GitHub ===
 
GitHub is an excellent tool for version control of documentation and do-files. It also lets users collaborate on code, making it easier for multiple people to work on the same files. More information on what GitHub does and how to get started with it can be found [https://guides.github.com/activities/hello-world/ here].
 
=== Using Box / Dropbox Version Control features ===
 
Box and Dropbox are both file hosting/cloud storage services which provide version control features.
 
'''''Important:''''' Pay attention to your Dropbox/Box subscription when using these services for version control, as they only store previous versions for a certain time. For example, free Dropbox accounts store versions of a file from the last 30 days. Similarly, Box does not allow users with free accounts to access previous versions of their files.
 
For more information on Dropbox's version control, see these links on [https://www.dropbox.com/en/help/11 recovering older versions of a file], [https://www.dropbox.com/en/help/9114 version history information], and [https://www.dropbox.com/help/113 extended version history for Dropbox Pro users].
 
More information on Box's version control, the number of histories saved, and tracking older versions can be found [https://community.box.com/t5/Managing-Your-Content/How-To-Track-Your-Files-and-File-Versions-Version-History/ta-p/329 here].




== Variable Names ==


== Back to Parent ==
This article is part of the topic [[Data Management]]
 


== Additional Resources ==
* [https://www2.stat.duke.edu/~rcs46/lectures_2015/01-markdown-git/slides/naming-slides/naming-slides.pdf Guidelines on naming conventions] from Duke University


[[Category: Data Management ]]

Latest revision as of 07:46, 16 January 2021
