Commit 6fd25f10 authored by Khalid Kunji

Minor fixes, lscpu grep, thread number calculation, etc...

parent db91a2a5
@@ -25,8 +25,8 @@ echo "System Info: " $(uname)
echo "Hostname: " $(hostname)
echo "Host ID: " $(hostid)
echo "Uptime: " $(uptime)
-cores=$(lscpu | grep Core | tail -c 2)
-sockets=$(lscpu | grep Socket | tail -c 2)
+cores=$(lscpu | grep "Core(s)" | tail -c 2)
+sockets=$(lscpu | grep "Socket(s)" | tail -c 2)
cores_total=$(($cores*$sockets))
threads_per_core=$(lscpu | grep Thread | tail -c 2)
threads_total=$(($cores_total*$threads_per_core))
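Review note: `tail -c 2` keeps only the last character of each matched line, so a value of 16 would be read as 6. A minimal sketch of one alternative follows, assuming the usual `Label: value` layout of `lscpu` output. The helper name `lscpu_field` and the sample text are hypothetical and not part of this commit, and the exact labels may differ between `lscpu` versions and locales.

```bash
#!/bin/bash
# Hypothetical helper (not part of the commit): print the value of one
# "Label: value" line from lscpu-style text read on stdin.
lscpu_field() {
    local label="$1"
    # Split each line on ":" plus any following blanks, require an exact
    # label match, and stop after the first matching line.
    awk -F':[ \t]*' -v label="$label" '$1 == label { print $2; exit }'
}

# Sample text stands in for real `lscpu` output so the sketch runs anywhere.
sample='Thread(s) per core:    2
Core(s) per socket:    16
Socket(s):             2'

cores=$(printf '%s\n' "$sample" | lscpu_field "Core(s) per socket")
sockets=$(printf '%s\n' "$sample" | lscpu_field "Socket(s)")
threads_per_core=$(printf '%s\n' "$sample" | lscpu_field "Thread(s) per core")

cores_total=$((cores * sockets))
threads_total=$((cores_total * threads_per_core))
echo "Number of Threads: ${threads_total}"   # prints: Number of Threads: 64
```

On a real machine the `printf '%s\n' "$sample"` part would be replaced by a call to `lscpu`.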
@@ -36,10 +36,15 @@ echo "Number of Threads: " ${threads_total}
export num_chunks=$(($threads_total - $threads_per_core))
#Compare with num_threads
-if [ "$num_threads" ]
+if [ "$num_threads" -lt "$num_chunks" ]
then
export num_chunks="$num_threads"
fi
+if [ "$num_chunks" -lt 1 ]
+then
+export num_chunks=1
+echo "Trouble determining number of threads correctly, using 1 as a failsafe."
+fi
echo "Number of chunks to split into: " ${num_chunks}
echo
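Review note: when `num_threads` is unset or empty, `[ "$num_threads" -lt "$num_chunks" ]` prints an "integer expression expected" error and evaluates as false, so the limit is skipped but the log gets noisy. The sketch below shows one way to combine the user limit with the failsafe. The function name `pick_num_chunks` is hypothetical and not part of this commit.

```bash
#!/bin/bash
# Hypothetical helper: choose a chunk count from the detected thread totals
# and an optional user-supplied thread limit (third argument).
pick_num_chunks() {
    local threads_total="$1" threads_per_core="$2" requested="${3:-}"
    local chunks=$((threads_total - threads_per_core))

    # Apply the user limit only when it is a positive integer, so an unset
    # or malformed value never reaches the numeric comparison.
    if [[ "$requested" =~ ^[1-9][0-9]*$ ]] && [ "$requested" -lt "$chunks" ]; then
        chunks="$requested"
    fi

    # Failsafe: never return fewer than one chunk.
    if [ "$chunks" -lt 1 ]; then
        chunks=1
    fi
    echo "$chunks"
}

pick_num_chunks 8 2      # no limit given: prints 6
pick_num_chunks 8 2 2    # limit of 2 threads: prints 2
pick_num_chunks 1 1      # detection gave too little: prints 1
```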
......
#!/bin/bash
export timecmd="$(which time)"
cd ${input_folder}
find "${gigi_split_prefix%/*}" -name "chunk_0*"
if [[ -n $(find "${gigi_split_prefix%/*}" -name "chunk_0*") ]]
then
echo "chunk_0.geno exists, file is most likely already split, if it is not, then remove the existing chunks from ${gigi_split_prefix%/*} and try again"
else
${gigi_split} "${param_file##*/}" "${num_chunks}" "${gigi_split_prefix}"
mkdir -p "${output_folder}/${run_name}/STATS"
$timecmd -o "${output_folder}/${run_name}/STATS/time${i}.log" -f'memory in kilobytes %M real %e user %U sys %S command %C' "${gigi_split}" "${param_file##*/}" "${num_chunks}" "${gigi_split_prefix}"
echo "Split exit status: " "$?"
fi
cd "$parent_path"
......
#!/bin/bash
timecmd="$(which time)"
cd "${input_folder}"
#cd "${gigi_split_chunks_folder}"
@@ -19,7 +17,6 @@ do
echo "file ${output_folder}/${run_name}/gigi_output/${i}/impute.geno already exists."
else
mkdir -p "${output_folder}/${run_name}/gigi_output/${i}"
mkdir -p "${output_folder}/${run_name}/STATS"
echo $(pwd)
$timecmd -o "${output_folder}/${run_name}/STATS/time${i}.log" -f'memory in kilobytes %M real %e user %U sys %S command %C' "${gigi}" "${file}" -outD="${output_folder}/${run_name}/gigi_output/${i}" & pids+=("$!")
echo "$!"
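Review note: the loop above backgrounds each run and records its PID in `pids`. The rest of the script is not shown in this diff, so the following is only a sketch of how those PIDs could be waited on one by one to count failed runs. The stand-in subshells and the `failed` counter are hypothetical.

```bash
#!/bin/bash
# Stand-in jobs; in the pipeline each would be a backgrounded GIGI run.
pids=()
( exit 0 ) & pids+=("$!")
( exit 3 ) & pids+=("$!")

failed=0
for pid in "${pids[@]}"; do
    # `wait PID` returns the exit status of that specific background job.
    if ! wait "$pid"; then
        echo "Job ${pid} failed" >&2
        failed=$((failed + 1))
    fi
done
echo "Failed jobs: ${failed}"   # prints: Failed jobs: 1
```

Waiting per PID, rather than with a bare `wait`, keeps each job's exit status available for the later merge step.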
......
@@ -16,8 +16,8 @@ if [ "$merge_status" -eq 0 ]
then
echo "COMPLETED SUCCESSFULLY!"
echo "Merged files are located at: " "${output_folder}/${run_name}/gigi_output"
-rm -R "${output_folder}"/"${run_name}"/gigi_output/*/
-rm -R "${output_folder}"/"${run_name}"/split_output/
+# rm -R "${output_folder}"/"${run_name}"/gigi_output/*/
+# rm -R "${output_folder}"/"${run_name}"/split_output/
cat "${output_folder}"/"${run_name}"/STATS/time* > "${output_folder}"/"${run_name}"/STATS/stats
rm "${output_folder}"/"${run_name}"/STATS/time*
else
......
-Scripts to run GIGI with multiple threads by splitting the input and mergin the output.
+# Run GIGI Split and MergeScripts
+#### Runs GIGI with multiple threads by splitting the input and merging the output.
Usage: run_GIGI parameter_file -o [OUTPUT FOLDER] -n [RUN NAME] -t [THREADS] -m [MEMORY IN MB]
Examples:
```bash
run_GIGI ./INPUTS/Sample_Input/param-v1_06.txt #Output in the current folder with no run name identifying subfolder, threads and memory determined automatically
run_GIGI ./INPUTS/Sample_Input/param-v1_06.txt -o ./OUTPUTS -n test_run #Output in ./OUTPUTS/test_run
run_GIGI ./INPUTS/Sample_Input/param-v1_06.txt -t 2 #Limit to only 2 threads (and hence two chunks)
run_GIGI ./INPUTS/Sample_Input/param-v1_06.txt -m 1000 #Limit memory use to 1 GB, not yet implemented, asking for both thread and memory limits may result in impossible scenarios.
run_GIGI ./INPUTS/Sample_Input/param-v1_06.txt -mt 1000 2 #Limit memory use to 1 GB and 2 threads, not yet implemented, asking for both thread and memory limits may result in impossible scenarios.
```
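The option handling of `run_GIGI` itself is not shown in this diff. As a sketch only, the usage line above could be parsed with the `getopts` builtin as follows. The function name `parse_args` and the defaults are assumptions. Note that with `getopts` the combined form `-mt 1000 2` would not parse as the example suggests (the `t` would be taken as the argument to `-m`), so `-m 1000 -t 2` would be needed instead.

```bash
#!/bin/bash
# Hypothetical parser for: run_GIGI parameter_file -o DIR -n NAME -t N -m MB
parse_args() {
    # The parameter file comes first, before any options.
    [ "$#" -ge 1 ] || return 1
    param_file="$1"
    shift

    # Assumed defaults: current folder, no run name, automatic limits.
    output_folder="."
    run_name=""
    num_threads=""
    memory_mb=""

    local OPTIND=1 opt
    while getopts "o:n:t:m:" opt; do
        case "$opt" in
            o) output_folder="$OPTARG" ;;
            n) run_name="$OPTARG" ;;
            t) num_threads="$OPTARG" ;;
            m) memory_mb="$OPTARG" ;;
            *) return 1 ;;
        esac
    done
}

parse_args ./INPUTS/Sample_Input/param-v1_06.txt -o ./OUTPUTS -n test_run -t 2
echo "Output: ${output_folder}/${run_name} threads=${num_threads}"
```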