|  |   | 
| (10 intermediate revisions by the same user not shown) | 
| Line 3: | Line 3: | 
|  | {{Beta Version}} |  | {{Beta Version}} | 
|  | 
 |  | 
 | 
|  | The SICdb dataset is provided as compressed .csv files; the minute values are consolidated even further. Refer to the [[Main Page]] for a detailed description of the files. |  | The SICdb dataset is provided as compressed .csv files; the minute values are consolidated even further. Refer to the [[File List]] for a detailed description of the files.   | 
|  | 
 |  | 
 | 
|  | The SICdb dataset contains billions of entries, so building a database from it can be challenging. We therefore provide a solution that is as simple as possible. This solution, called [[SICdb Environment]], offers a fully preconfigured, fast environment for accessing, exploring and exporting SICdb data. Refer to the Quick Start chapter if you are familiar with the command line and Docker; otherwise skip to the Detailed Instructions for a more detailed reference. |  | The SICdb dataset contains billions of entries, so building a database from it can be challenging.   | 
|  | 
 |  | 
 | 
|  | == Quick Start == |  | == Usage Examples == | 
|  | 
 |  | 
 | 
|  | Like other ICU datasets, SICdb is huge. Expect your PC to need a significant amount of time to process it!
 |  | Scripting or Database Query examples for using SICdb are found at https://github.com/nrodemund/sicdb/tree/main/Examples | 
|  | 
 |  | 
 | 
|  | The database can be built using [http://www.docker.com Docker]. After installing it, navigate to the folder containing all the data and run "docker compose up". Once the environment is running, open http://localhost:5000 to install the dataset. The provided environment is fully preconfigured: just press start and wait. Due to the vast size of the dataset, this may take 4-16 hours*. When the installation has finished, the server has to be restarted; you can do this by reloading the page and then pressing the restart button.
 |  | == Relational Database Import == | 
|  | 
 |  | 
 | 
|  | *) We are working on a solution to provide a fully indexed database. So far we have not found a legally safe way of distributing it (repository).
 |  | Refer to https://github.com/nrodemund/sicdb/tree/main/Import for some scripts for importing the dataset into a relational database system. | 
|  |   |  | 
|  | == Detailed Instructions (Windows) ==
 |  | 
|  |   |  | 
|  | * Download Data
 |  | 
|  | * Install [http://www.docker.com Docker]
 |  | 
|  | * [[Open commandline]] as admin (*)
 |  | 
|  | * [[Navigate to data folder]]
 |  | 
|  | * Run "docker compose up" 
 |  | 
|  | * When the command has finished, wait another 30 seconds, as the MySQL server needs some time to configure itself
 |  | 
|  | * Open http://localhost:5000
 |  | 
|  | * Select install and start installation
 |  | 
|  |   |  | 
|  | The installation takes some time; expect several hours. The process can be interrupted and will resume where it stopped.
 |  | 
|  |   |  | 
|  | If you need any help with installation please feel free to [[contact us]]!
 |  | 
|  |   |  | 
|  | == Issues and Troubleshooting ==
 |  | 
|  |   |  | 
|  | Generally, the environment should run on all common operating systems and machines (we recommend >=16 GB RAM and a modern multicore CPU).
 |  | 
|  |   |  | 
|  | Some issues are to be expected when the environment is run on a "standard" Windows machine. We recommend using WSL2, but there are some known bugs and issues that had not been resolved as of 10/2022. We have not encountered any issues on a Linux machine.
 |  | 
|  |   |  | 
|  |   |  | 
|  | * Installing the database will cause WSL2 to reserve 50% of your machine's RAM, which is not released automatically. This is expected to be patched in the future, but for now, after installing the dataset, you can either close the container and run "wsl --shutdown" in an admin PowerShell (Docker Desktop will automatically restart WSL) or restart your computer.
 |  | 
|  |   |  | 
|  | * Due to a known bug in WSL2, it may not be possible to run the environment on a software-mounted drive. This can cause issues when trying to launch the SICdb environment from a VeraCrypt/TrueCrypt drive. BitLocker is fully supported as far as we know.
 |  | 
|  | * Due to a known 'bug' in the way Windows manages hosts, the SICdb environment may not be reachable at localhost:5000 or may not start at all. This can be resolved by closing Docker Desktop, running it as administrator and then restarting the engine. The restart option is found in the system tray (bottom right on Windows machines): right-click the Docker icon -> restart.
 |  | 
Introduction
The SICdb dataset and its documentation are, as of 04/24, in active development. We try to improve the project as fast as we can. Please contact us about any problem you find with the data or the software. We would love to see motivated researchers get as much as possible out of our dataset.
 
The SICdb dataset is provided as compressed .csv files; the minute values are consolidated even further. Refer to the File List for a detailed description of the files.
The SICdb dataset contains billions of entries, so building a database from it can be challenging.
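Because the files are too large to load into memory at once, one practical approach is to stream them in chunks. The following is a minimal sketch using only the Python standard library. It assumes the files are gzip-compressed (.csv.gz), which this page does not confirm, and the helper names `iter_csv_chunks` and `count_rows` are hypothetical, not part of any SICdb tooling.

```python
import csv
import gzip
from itertools import islice
from typing import Dict, Iterator, List


def iter_csv_chunks(path: str, chunk_size: int = 100_000) -> Iterator[List[Dict[str, str]]]:
    """Yield lists of rows (as dicts keyed by column name) from a gzip-compressed CSV.

    Assumption: the file is gzip-compressed and has a header row.
    """
    # "rt" opens the gzip stream in text mode so the csv module can parse it directly.
    with gzip.open(path, "rt", newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        while True:
            # Pull at most chunk_size rows; the reader keeps its position between calls.
            chunk = list(islice(reader, chunk_size))
            if not chunk:
                break
            yield chunk


def count_rows(path: str, chunk_size: int = 100_000) -> int:
    """Count data rows while holding at most one chunk in memory."""
    return sum(len(chunk) for chunk in iter_csv_chunks(path, chunk_size))
```

A call such as `count_rows("some_table.csv.gz")` (the file name is a placeholder) processes the file one chunk at a time; any per-chunk filtering or aggregation can be done inside the loop over `iter_csv_chunks` in the same way.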
Usage Examples
Scripting or Database Query examples for using SICdb are found at https://github.com/nrodemund/sicdb/tree/main/Examples
Relational Database Import
Refer to https://github.com/nrodemund/sicdb/tree/main/Import for some scripts for importing the dataset into a relational database system.
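For orientation only, the general idea of a batched import can be sketched as follows. This is not one of the scripts from the repository above; it is an illustration that assumes gzip-compressed input with a header row and uses SQLite from the Python standard library as a stand-in for a relational database system. The function name `import_csv_to_sqlite` and all file and table names are hypothetical.

```python
import csv
import gzip
import sqlite3
from itertools import islice


def import_csv_to_sqlite(csv_path: str, db_path: str, table: str, batch_size: int = 50_000) -> int:
    """Load a gzip-compressed CSV into an SQLite table and return the number of rows inserted.

    Assumptions: the file is gzip-compressed, the first row holds the column
    names, and table/column names come from a trusted source (they are
    interpolated into SQL as quoted identifiers).
    """
    inserted = 0
    with gzip.open(csv_path, "rt", newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader)  # first row: column names
        columns = ", ".join(f'"{name}"' for name in header)
        placeholders = ", ".join("?" for _ in header)
        connection = sqlite3.connect(db_path)
        try:
            # Columns are created without declared types; values are stored as text.
            connection.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({columns})')
            while True:
                batch = list(islice(reader, batch_size))
                if not batch:
                    break
                connection.executemany(
                    f'INSERT INTO "{table}" VALUES ({placeholders})', batch
                )
                # Commit per batch so an interrupted import keeps earlier batches.
                connection.commit()
                inserted += len(batch)
        finally:
            connection.close()
    return inserted
```

Inserting in batches keeps memory use bounded regardless of file size. For a dataset of this scale, typed columns and indexes created after the bulk load would likely be needed before queries become practical.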