Software carpentry

Revision as of 10:38, 11 January 2019

This text is based on a workshop given at UiO 26.-27. February 2015. Written by IBN. http://software-carpentry.org/lessons.html

Basic terminal commands

Navigating and creating folders

Where am I? (pwd)

pwd  # print working directory

shows you which directory you are in.

Change directory (cd)

cd ./pc/Dokumenter

changes into another folder

Make folder (mkdir)

mkdir chosenfoldername/ # the trailing slash is optional, but shows that this is a folder.

Creates a folder in your path.

Send output to file, not to screen (>)

pwd > path.txt

Saves the working directory in the file path.txt instead of showing it in the terminal. If path.txt already exists, > overwrites it.
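
A small sketch of how redirection behaves when it is repeated. The append operator >> is not covered in the text above and is included here only for contrast; the temporary folder is an assumption so that no existing path.txt is touched.

```shell
# Work in a temporary folder so no existing file is overwritten.
workdir=$(mktemp -d)
cd "$workdir"

pwd > path.txt        # creates path.txt (or replaces its content)
pwd > path.txt        # running it again still leaves a single line
pwd >> path.txt       # >> appends to the file instead of overwriting it

lines=$(wc -l < path.txt)
echo "path.txt has $lines lines"
```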

Send the output of one command to another command - pipe (|)

ls -l | grep key | less

With a pipe (|), the output of program1 becomes the input of program2, whose output becomes the input of program3. Here, the content of our directory is listed (ls -l), then grep keeps only the lines containing the characters 'key', and the result is sent to less, where it is displayed in a separate buffer (q quits the 'less buffer').
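
As a rough illustration of how data flows through a pipe, the sketch below builds a throwaway folder with three made-up file names (keynote.txt, monkey.txt and notes.txt are assumptions, not workshop files). It ends the pipe with wc -l instead of less, so that it also runs without a keyboard.

```shell
# Work in a temporary folder so nothing in your own files is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical example files; two of the names contain "key".
touch keynote.txt monkey.txt notes.txt

# ls writes the file names (plain ls, so that only names are searched),
# grep keeps the lines containing "key",
# and wc -l counts the lines that are left.
matches=$(ls | grep key | wc -l)
echo "files with 'key' in the name: $matches"
```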

Show all files ending with A.txt or B.txt (*)

ls *A.txt *B.txt
ls *[AB].txt

These two commands do the same thing.

Sort files numerically and extract smallest/highest value (sort)

sort -n *[AB].txt | head -n 1

Finds the smallest value (for the highest, use tail -n 1) across all files ending with A.txt or B.txt, assuming the files contain one number per line.
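
The -n flag matters because sort otherwise compares the lines as text, character by character. A minimal sketch with made-up numbers (10, 9 and 100 are assumptions, not workshop data):

```shell
# Three hypothetical values, one per line.
numbers=$(printf '10\n9\n100\n')

# Without -n the lines are compared as text, so "10" sorts before "9".
text_first=$(printf '%s\n' "$numbers" | sort | head -n 1)

# With -n the lines are compared as numbers.
smallest=$(printf '%s\n' "$numbers" | sort -n | head -n 1)
largest=$(printf '%s\n' "$numbers" | sort -n | tail -n 1)

echo "text sort puts first: $text_first"
echo "smallest: $smallest, largest: $largest"
```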

Loop over many files in a folder

for filename in *[AB].txt; do echo "$filename"; done

Loops over all files ending with A.txt or B.txt and prints the content of the variable, that is, each filename in turn.

When you give the variable a value, don't use a dollar sign; when you use the variable, do use a dollar sign.
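
The dollar rule can be sketched outside a loop as well. The variable name backup and the .bak ending are made up for this illustration.

```shell
# Assigning: no dollar sign, and no spaces around "=".
filename=draft.txt

# Reading: dollar sign. The quotes keep names with spaces in one piece.
echo "$filename"

# Curly braces mark where the variable name ends.
backup="${filename}.bak"
echo "$backup"
```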

for filename in *[AB].txt; do echo "$filename"; sort -n "$filename" | head -n 1; done

This will also print the minimum value of each file.

for filename in *[AB].txt; do echo "$filename"; sort -n "$filename" | head -n 1 > "smallest-$filename"; done

Here we still print the filenames, but the smallest number of each file is saved in its own file (smallest-<filename>) instead of being printed.
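
To try the loop without real data, something like the following could be used. The file names run1A.txt, run1B.txt and run1Z.txt and their contents are assumptions made up for this sketch.

```shell
# Work in a temporary folder so nothing in your own files is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical data files with one number per line.
printf '5\n3\n8\n' > run1A.txt
printf '12\n7\n' > run1B.txt
printf '1\n' > run1Z.txt    # does not match *[AB].txt, so it is skipped

for filename in *[AB].txt
do
    echo "$filename"
    sort -n "$filename" | head -n 1 > "smallest-$filename"
done

# Show the two result files.
cat smallest-run1A.txt smallest-run1B.txt
```

Running the loop a second time in the same folder would also pick up the smallest-... files, because their names end with A.txt or B.txt too.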

Files

Create file (touch)

mkdir thesis/ 
touch thesis/draft.txt

creates an empty file in an existing folder.

Write to file

nano draft.txt
emacs -nw draft.txt
vim draft.txt

These commands open different text editors.

View content of file

cat draft.txt
less draft.txt
more draft.txt

lets you view the content of the file in the terminal.

Count words or lines in file (wc)

wc ethane.pdb
# prints: 20 156 1158 ethane.pdb

Shows 20 lines, 156 words and 1158 characters.

wc -l *.pdb

Shows number of lines in all files ending with .pdb.

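wc -c *.pdb > lengths.txt
sort -n lengths.txt

Saves the number of characters (bytes) in each .pdb file to lengths.txt, then shows that list sorted numerically.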

See start or end of a file

head -n 10 myfile.txt

shows the first 10 lines (10 is also the default, so head myfile.txt does the same)

tail -n 1 myfile.txt

shows the last line

Create a file with the output of a command

sort -n myfile.txt | head -n 1

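sorts the file myfile.txt numerically, then outputs the first line of the result. If myfile.txt was itself created from the output of a command (for example wc -l *.pdb > myfile.txt), you can skip the file and pipe the output directly: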

wc -l *.pdb | sort -n | head -n 1


How many text files are in my directory?

ls *.txt | wc -l

Lists the .txt files in the directory, then counts the lines of that list.

Which file has the fewest lines?

wc -l *.txt | sort -n | head -n 1

Count number of lines in all text files in the folder, then sort them numerically, then show the first line.
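
One detail worth knowing: when wc gets several files, it adds a "total" line at the end. That does not disturb the search for the fewest lines, but it does matter if you look for the most lines with tail. A sketch with three made-up files (one.txt, two.txt and three.txt are assumptions):

```shell
# Work in a temporary folder so nothing in your own files is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical files with one, two and three lines.
printf 'a\n' > one.txt
printf 'a\nb\n' > two.txt
printf 'a\nb\nc\n' > three.txt

# How many .txt files are there?
count=$(ls *.txt | wc -l)

# Fewest lines: first line of the numerically sorted counts.
fewest=$(wc -l *.txt | sort -n | head -n 1)

# Most lines: the last line is wc's "total", so take the one before it.
most=$(wc -l *.txt | sort -n | tail -n 2 | head -n 1)

echo "$count files; fewest: $fewest; most: $most"
```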

Remove files or folders

rm

rm file

removes a file

rmdir

rmdir myfolder/

deletes empty folders

rm -r

rm -r myfolder/

deletes non-empty folders. This is very dangerous because there is no recycle bin! Always run

ls myfolder/

before rm -r myfolder/, to check what you are about to remove. Don't ever write

rm -r *

It deletes all visible files and subfolders in the current folder. And make sure you have a backup.
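
The look-first habit can be practised safely in a throwaway folder. In this sketch the folder content (notes.txt) is made up, and the temporary folder is an assumption so that nothing of yours is at risk.

```shell
# Work in a temporary folder so nothing in your own files is touched.
workdir=$(mktemp -d)
cd "$workdir"

# A hypothetical folder with some content.
mkdir myfolder
touch myfolder/notes.txt

# Look first ...
ls myfolder/

# ... then remove. Adding -i (rm -ri myfolder/) would ask before each deletion.
rm -r myfolder/

# Confirm that the folder is gone.
if [ ! -e myfolder ]; then echo "myfolder removed"; fi
```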


The Unix shell is not forgiving.

Which commands did I type? (history)

history

shows what you typed.

history | less 

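sends the output to the pager less. Press q to quit.

history | tail | head -n 3 > smallest.sh

saves commands from the middle of the history to the file smallest.sh: tail keeps the last ten lines and head -n 3 keeps the first three of those. history puts a line number in front of each command, so remove those numbers before running the file as a script.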

Shell scripts

Either make a small text file called script.sh that processes the files in the current directory named N*[AB].txt.

# for each file, write the smallest number to a new file
for filename in N*[AB].txt
do
    echo "$filename"
    sort -n "$filename" | head -n 1 > "smallest-$filename"
done

To run this, type:

bash script.sh 
(or source script.sh)

Or generalise your script by replacing N*[AB].txt with "$@" and passing the file names as arguments to the call.

for filename in "$@"  # for each file name given on the command line
do
    echo "$filename"
    sort -n "$filename" | head -n 1 > "smallest-$filename"
done

To run this, type:

bash script.sh N*[AB].txt
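
A possible way to test the generalised script end to end is sketched below. The input files N1A.txt and N2B.txt and the check for missing arguments are additions made up for this sketch; they are not part of the workshop script.

```shell
# Work in a temporary folder so nothing in your own files is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical input files with one number per line.
printf '4\n2\n9\n' > N1A.txt
printf '6\n5\n' > N2B.txt

# Write the script; the quoted 'EOF' stops the shell from expanding "$@" here.
cat > script.sh <<'EOF'
# Stop with a hint if no file names were given (an added safety check).
if [ "$#" -eq 0 ]; then
    echo "usage: bash script.sh FILE..." >&2
    exit 1
fi
for filename in "$@"
do
    echo "$filename"
    sort -n "$filename" | head -n 1 > "smallest-$filename"
done
EOF

# The shell expands N*[AB].txt to the matching names before the script starts.
bash script.sh N*[AB].txt
```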


Version control (git)

Installation

If Git is not already available on your machine, you can try to install it via your distro's package manager. For Debian/Ubuntu run sudo apt-get install git, and for Fedora run sudo yum install git. (From http://karinlag.github.io/2015-02-26-Oslo/)

Setup

The first time you use git, tell it your name and email address.

git config --global user.name "Irene Brox Nilsen" 
git config --global user.email "irenebn@geo.uio.no"

Git terminology

Version control works best on text files (.tex, .py, .m, and so on). When you are happy with your text, add the file to the stage using "git add file1.txt". When you want to store the file in the permanent storage (the repository), commit it using "git commit". The most recent commit is called HEAD. The one (two) before the last is called HEAD~1 (HEAD~2). Push means to upload to GitHub; pull means to download from GitHub.

Git commands

git init # starts git: "there is something here to keep track of"

Now we have created a hidden .git directory. Git stores the history of the tracked files there and uses it to figure out which version is which.

git status

Checks what git is keeping track of. Now, create a file git can keep track of (file1.txt). git status will now say there are untracked files. We have to ask git to pay attention to the file.

git add file1.txt

Now, git status says that git has registered a new file it hasn't seen before. But we have not asked git to save changes yet. Terminology: "make a commit" means to save a snapshot of the contents of the file.

git commit -m "Starting to think about project"           # -m means message: why you are saving a snapshot now.

git status will now say that everything is clean, meaning that the stored version and the one you're working on are identical. To see the history of commits, use log:

git log

The commit is identified by a long hexadecimal string, the commit ID. Every commit gets its own ID, and any change to the content gives a different one. Because the ID is unique, you only have to copy the first 6-8 characters instead of the whole thing when you want to compare two versions of a file (for example git diff 45b9d809 e3d5f467 file1.txt, where both IDs are placeholders). If you edit your file and run git status, it tells you that the file has changed. To see what's different, type:

git diff

The output is in patch format: it compares two versions of the file, a/file1.txt and b/file1.txt. a/ is the old version, b/ is the one we have now. To save the changes to git, tell git which files' changes should go into the next commit:

git add file1.txt 
git add file2.txt      
git commit -m "introduction written" 

This commit saves the changes for all files that were added with git add. You could also have written git add *.txt to add all text files at once. Note: in this workflow the file names are given to add, not to commit.

When you run git status and get "Changes not staged for commit", it means that you have to run git add file1.txt. Run git status again and you'll get "Changes to be committed". Now you might want to look at the difference again before storing permanently:

git diff --staged

This shows the difference between the stage and the last commit in the permanent storage.

git commit -m "methods drafted"

This stores the changes in the permanent storage (repository). Finally, run git status again.
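
The whole edit, add, diff, commit cycle can be tried in a throwaway repository. The sketch below assumes git is installed; the user name, email address and file content are placeholders, and git config is used without --global so that your real settings are left alone.

```shell
# Work in a temporary folder and start a new repository there.
workdir=$(mktemp -d)
cd "$workdir"
git init -q

# Identity for this throwaway repository only (hypothetical values).
git config user.name "Example User"
git config user.email "user@example.com"

# First snapshot.
echo "Introduction" > file1.txt
git add file1.txt
git commit -q -m "introduction written"

# Edit the file: the change exists in the working folder only.
echo "Methods" >> file1.txt
unstaged=$(git diff --name-only)            # files with unstaged changes

# Stage it: plain "git diff" is now empty, "git diff --staged" shows the change.
git add file1.txt
staged=$(git diff --staged --name-only)     # files with staged changes

# Second snapshot.
git commit -q -m "methods drafted"
git status --short                          # prints nothing when everything is clean
commits=$(git log --oneline | wc -l)
echo "commits so far: $commits"
```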

How to go back to previous versions?

Start every morning typing git status. What did I do yesterday?

git status

Oh! Right, I over-wrote most of the text. That wasn't too clever, was it?

You can refer to your last commit as HEAD instead of looking up its commit ID.

git diff HEAD~1 file1.txt

Shows the differences between the file as it is now and the version stored one commit before the last. I want to get the previous version back:

git checkout HEAD file1.txt
less file1.txt                

Is the brilliant stuff in this version? If not, check the previous one

git checkout HEAD~1 file1.txt
less file1.txt                

Is the brilliant stuff in this version? If not, check the previous one

git checkout HEAD~2 file1.txt
less file1.txt                

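Yes, this is the one. Run git status: it lists file1.txt under "Changes to be committed", because the restored version differs from the last commit. Commit it to keep it.

Don't forget the filename when running checkout! If you type git checkout HEAD~1 without a filename, git switches all tracked files back to that commit and leaves you in a "detached HEAD" state (git checkout master brings you back).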
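
Restoring an old version can be rehearsed in a throwaway repository before you need it for real. In this sketch the three commit messages, the file content and the identity are made up, and git is assumed to be installed.

```shell
# Work in a temporary folder and start a new repository there.
workdir=$(mktemp -d)
cd "$workdir"
git init -q
git config user.name "Example User"        # hypothetical identity,
git config user.email "user@example.com"   # stored in this repository only

# Three commits, each replacing the content of file1.txt.
for text in "first draft" "brilliant stuff" "overwritten by mistake"
do
    echo "$text" > file1.txt
    git add file1.txt
    git commit -q -m "$text"
done

# Bring back the version from one commit before the last one.
git checkout HEAD~1 file1.txt
restored=$(cat file1.txt)
echo "file1.txt now contains: $restored"

# The restored file is already staged; commit it to make the rescue permanent.
git commit -q -m "restore brilliant stuff"
git log --oneline
```

Nothing is lost by this: the mistaken version is still in the history, and the restored text is simply stored as a new commit on top of it.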

Store things on github

Tools such as https://github.com/ let you store your files openly for everyone on the internet; https://bitbucket.org/ also offers closed (private) repositories.

To add things to GitHub, first create a GitHub account and a new repository. Then link your computer (the terminal you are using) to GitHub by pasting the repository link into this command:

git remote add origin https://github.com/irenebn/planets.git

Here, we register this link, with the alias 'origin'. Check if it went ok:

git remote -v

It should give this:

origin	https://github.com/irenebn/planets.git (fetch)
origin	https://github.com/irenebn/planets.git (push)

To upload things,

git push origin master # then log in with your GitHub username and password (GitHub now asks for a personal access token in place of the password)

To download things

git pull origin master

If someone adds you to their repository, you'll get a link to their repo. Copy it, and paste it into your terminal:

git clone https://github.com/user2/repo.git

You'll get a copy of their repo as a folder. Check it using

git status        # nothing to commit
git log           # history of user2's commits
git remote -v     # now "origin" points to user2's repo

You may now create a new file.

git status
git add file3.txt
git commit -m "added some notes on topic 3"

It is saved locally; your partner cannot see it yet. To upload it to user2's repo,

git push origin master 

Make sure that you're working on two directories locally: your own and user2's.

Working on the same file

If you change line 1 and user2 changes line 10, there's no conflict, and everything is fine. What happens if you both change the same line in your local copies? Both people push the file to the same repo (not at the same time).

git push origin master # I upload my things
git push origin master # User2 uploads her things

Her push will be rejected. She has to pull my changes from GitHub and decide which line is the best one. When she pulls, she will get a message saying "CONFLICT ... fix conflicts and then commit the result". Git has now added the lines from both users into the file, and she can delete what she doesn't want or merge the two versions. Then she commits that and pushes it.
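
The rejected push and the conflict can be reproduced on one machine, without a GitHub account. In the sketch below, a bare repository in a local folder (origin.git) stands in for GitHub, and the folders me and user2 stand in for the two people. All names, the identity and the file content are assumptions for illustration, and git is assumed to be installed.

```shell
# Work in a temporary folder so nothing in your own files is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Identity for every commit made in this sketch (hypothetical values).
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

# A bare repository in a local folder plays the role of GitHub.
git init -q --bare origin.git
git --git-dir=origin.git symbolic-ref HEAD refs/heads/master

# "me": create the project and upload the first version.
git init -q me
cd me
git symbolic-ref HEAD refs/heads/master   # make sure the branch is called master
echo "Mars is red" > file1.txt
git add file1.txt
git commit -q -m "first version"
git remote add origin ../origin.git
git push -q origin master
cd ..

# "user2": download a copy of the project.
git clone -q origin.git user2

# Both change the same line; I push first.
cd me
echo "Mars is dusty" > file1.txt
git commit -q -a -m "my change"
git push -q origin master
cd ../user2
echo "Mars is cold" > file1.txt
git commit -q -a -m "her change"

# Her push is rejected because the remote has a commit she lacks.
if git push -q origin master 2>/dev/null; then push_result=accepted; else push_result=rejected; fi
echo "user2's push was $push_result"

# Pulling tries to merge my change into her copy and stops at the conflict.
git pull -q --no-rebase origin master >/dev/null 2>&1 || echo "pull stopped: conflict in file1.txt"
markers=$(grep -c '^<<<<<<<' file1.txt)   # number of conflict blocks in the file
cat file1.txt                             # both versions, separated by marker lines

# She decides what the line should say, then commits and pushes.
echo "Mars is cold and dusty" > file1.txt
git add file1.txt
git commit -q -m "merge both changes"
git push -q origin master
```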


In git, you need to pull before you push.