
Commit 115508e3 authored by Khalid Kunji

Update README.md

parent c263e32c
# Run GIGI Split and Merge Scripts
### Runs GIGI with multiple threads by splitting the input and merging the output.
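The split, run, merge pattern can be illustrated with a minimal sketch. This is not the logic of GIGISplit or GIGIMerge: the function name `split_run_merge` is hypothetical, the input is split naively by line count, and the per-chunk work is a placeholder (`tr` upper-casing the text) standing in for one GIGI run per chunk.

```bash
#!/usr/bin/env bash
# Hypothetical illustration of split -> parallel run -> merge.
# Arguments: input file, number of chunks, merged output file.
split_run_merge() {
    local input="$1" chunks="$2" output="$3"
    local workdir total per_chunk part
    workdir="$(mktemp -d)"

    # Lines per chunk, rounded up so that at most $chunks pieces are produced.
    total="$(wc -l < "$input")"
    per_chunk=$(( (total + chunks - 1) / chunks ))
    [ "$per_chunk" -gt 0 ] || per_chunk=1

    # Split the input into pieces named chunk_aa, chunk_ab, ...
    split -l "$per_chunk" "$input" "$workdir/chunk_"

    # Process every chunk in the background (placeholder work only).
    for part in "$workdir"/chunk_*; do
        tr 'a-z' 'A-Z' < "$part" > "$part.out" &
    done
    wait  # block until all background jobs have finished

    # Merge the per-chunk outputs in their original order.
    cat "$workdir"/chunk_*.out > "$output"
    rm -r "$workdir"
}
```

Calling `split_run_merge input.txt 2 merged.txt` would process `input.txt` in two pieces and write the recombined result to `merged.txt`.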
#### Requirements
A reasonably modern version of g++ is needed only if you have to recompile the binaries. We have not checked how far back you can go, but the default version from most OS package managers should be fine.
#### Getting run_GIGI
##### With Git
git clone https://cse-git.qcri.org/Imputation/Impute_Beaming.git
##### With a browser
Go to this URL: https://cse-git.qcri.org/Imputation/Impute_Beaming
Click the icon with the down arrow, just to the left of the "+" icon.
Several download options with different compression formats are available. If you get run_GIGI this way, you will need to decompress the archive before proceeding.
#### Installation
The repository includes executables compiled on 64-bit Red Hat Linux. If this is not your system, you may need to recompile them.
Starting from the folder where you downloaded run_GIGI:
cd Impute_Beaming/SPLIT/
g++ GIGISplit.cpp -o gigisplit
cd ../MERGE/
g++ GIGIMerge.cpp -o gigimerge
That's it; run_GIGI is installed.
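To decide whether recompiling is likely to be needed, a quick platform check can help. The sketch below is only a heuristic: the function name `matches_prebuilt` is hypothetical, and it assumes that "64 bit Linux" means the `x86_64` machine type. A match does not guarantee that the shipped binaries will start, since library versions can still differ.

```bash
#!/usr/bin/env bash
# Hypothetical helper: returns 0 when the given kernel name and machine
# type look like 64-bit Linux (assumed here to mean x86_64).
matches_prebuilt() {
    [ "$1" = "Linux" ] && [ "$2" = "x86_64" ]
}

if matches_prebuilt "$(uname -s)" "$(uname -m)"; then
    echo "Prebuilt binaries may work as-is; recompile if they fail to start."
else
    echo "Recompile gigisplit and gigimerge with g++ as described above."
fi
```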
#### Usage
Note: The parameter file is the same one you would normally use for GIGI. If it is in the long format, pass the "-l" option.
The examples below use the file "param-v1_06.txt" because it is included in the repository, so each example can be run by simply copying and pasting the line.
Note: Memory constraints are not yet implemented, so use your own judgement. We have not yet seen an example that does not fit comfortably in 6 GB.
run_GIGI parameter_file -o [OUTPUT FOLDER] -n [RUN NAME] -t [THREADS] -m [MEMORY IN MB] [-l]
-o [OUTPUT FOLDER] : This is the path to use for the outputs from the run_GIGI scripts, including temporary files.
-n [RUN NAME] : This is a path relative to the [OUTPUT FOLDER] to use to keep the outputs from more than one run of run_GIGI separated.
-t [THREADS] : The number of threads to use for run_GIGI, and also the number of chunks to split the input into.
-m [MEMORY IN MB] : The amount of RAM that run_GIGI will restrict its use to (not yet implemented).
-l : Specifies that the input is in the long format.
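For readers who want to wrap run_GIGI in their own scripts, the option layout above can be parsed with the shell built-in `getopts`. This is a sketch under stated assumptions, not run_GIGI's actual parsing code: the function name `parse_args` and the variable names are hypothetical, the default output folder of `.` is inferred from the first example, and only separately written options (such as `-t 2 -m 1000`) are handled, not combined forms such as `-mt 1000 2`.

```bash
#!/usr/bin/env bash
# Hypothetical parser for: parameter_file [-o DIR] [-n NAME] [-t N] [-m MB] [-l]
parse_args() {
    local opt
    # The parameter file is the first positional argument.
    param_file="$1"
    shift

    # Defaults (assumed): current folder, no run name, automatic limits.
    out_dir="."
    run_name=""
    threads=""
    memory_mb=""
    long_format=0

    OPTIND=1  # reset so the function can be called more than once
    while getopts "o:n:t:m:l" opt; do
        case "$opt" in
            o) out_dir="$OPTARG" ;;
            n) run_name="$OPTARG" ;;
            t) threads="$OPTARG" ;;
            m) memory_mb="$OPTARG" ;;
            l) long_format=1 ;;
            *) return 1 ;;  # unknown option or missing argument
        esac
    done
}
```

After `parse_args ./INPUTS/Sample_Input/param-v1_06.txt -o ./OUTPUTS -n test_run`, the variables `out_dir` and `run_name` hold `./OUTPUTS` and `test_run`.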
Examples:
```bash
run_GIGI ./INPUTS/Sample_Input/param-v1_06.txt # Output in the current folder with no run-name subfolder; threads and memory are determined automatically
run_GIGI ./INPUTS/Sample_Input/param-v1_06.txt -o ./OUTPUTS -n test_run # Output in ./OUTPUTS/test_run
run_GIGI ./INPUTS/Sample_Input/param-v1_06.txt -o ./OUTPUTS -n test_run -l # Output in ./OUTPUTS/test_run for a long-format parameter file; do not copy and paste this one, because the included param-v1_06.txt is not in the long format
run_GIGI ./INPUTS/Sample_Input/param-v1_06.txt -t 2 # Limit to 2 threads (and hence 2 chunks)
run_GIGI ./INPUTS/Sample_Input/param-v1_06.txt -m 1000 # Limit memory use to 1 GB (not yet implemented); asking for both thread and memory limits may result in impossible scenarios
run_GIGI ./INPUTS/Sample_Input/param-v1_06.txt -mt 1000 2 # Limit memory use to 1 GB and threads to 2 (memory limit not yet implemented); asking for both thread and memory limits may result in impossible scenarios
run_GIGI ./INPUTS/Sample_Input/param-v1_06.txt -lmt 1000 2 # Same limits with long-format input (memory limit not yet implemented); do not copy and paste this one, because the included param-v1_06.txt is not in the long format
```
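The first example leaves the thread count to be determined automatically. How run_GIGI chooses that value is not described here. If you would rather pass `-t` explicitly, one possible way to pick a number is to ask the system how many processors are online. The helper name `detect_threads` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the number of online processors, or 1 if
# it cannot be determined.
detect_threads() {
    local n
    n="$(getconf _NPROCESSORS_ONLN 2>/dev/null)" || n=""
    case "$n" in
        ''|*[!0-9]*) n=1 ;;  # empty or non-numeric: fall back to 1
    esac
    echo "$n"
}

# Example use (not executed here):
# run_GIGI ./INPUTS/Sample_Input/param-v1_06.txt -t "$(detect_threads)"
```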
#### Note
The parameter file (e.g., param-v1_06.txt) is the same input that GIGI needs when run without splitting.
If a problem makes GIGI stop before completion, the output files are left as they are so that users can rerun only the failed portions as needed.
If you are unsure where the failure occurred, the safest approach is to remove the output files before rerunning (e.g. rm -R [OUTPUT FOLDER]/[RUN NAME]). Always use rm with caution.
For example, if the second example failed, run "rm -R ./OUTPUTS/test_run" before rerunning.
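Because `rm -R` with an empty variable can remove far more than intended, a small guard may be worth using when scripting the cleanup. The sketch below is one cautious way to do it. The function name `clean_run` is hypothetical and is not part of run_GIGI.

```bash
#!/usr/bin/env bash
# Hypothetical helper: remove [OUTPUT FOLDER]/[RUN NAME], refusing to act
# when either part is empty.
clean_run() {
    local out_dir="$1" run_name="$2" target
    if [ -z "$out_dir" ] || [ -z "$run_name" ]; then
        echo "Refusing to remove: output folder and run name must both be set." >&2
        return 1
    fi
    target="$out_dir/$run_name"
    # Only remove the target if it is an existing directory.
    if [ -d "$target" ]; then
        rm -R -- "$target"
    fi
}
```

For the second example above, the call would be `clean_run ./OUTPUTS test_run`.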