(howto:overview)=
# Tutorials

This chapter presents a set of tutorials that serve as a quick start to the project and its technology.

Here's an overview of all available tutorials:

- **[Start: Start working with OpenWebSearch.eu Data](0_first_steps.md)** - Introduces the `owilix` command-line tool for accessing, downloading, and querying datasets from the [Open Web Index](https://openwebsearch.eu/the-project/research-results/open-web-search-book/).


- **[Tutorial 0: How to create an index locally](a_local_index.md)** - Shows how to set up the indexing pipeline locally using Docker, from crawling websites to creating searchable indexes.

- **[Tutorial 1: Data Download with owilix](b_data_download_with_owilix.md)** - Demonstrates how to use the `owilix` command-line tool to access, download, and query Open Web Index datasets.

- **[Tutorial 2: How to use MOSAIC](c_mosaic.md)** - Explains how to set up and run the MOSAIC search engine for searching through index data with a REST API and web interface.

- **[Tutorial 3: Slicing Data with owilix](d_data_slicing_with_owilix.md)** - Covers how to create custom data subsets by running SQL queries against parquet files using `owilix`.

- **[Tutorial 4: How to Analyse OWI Data](e_owi_data_analytics.ipynb)** - Shows how to pull OWI data and analyze it using Python, pandas, and Jupyter notebooks for statistical analysis.

- **[Tutorial 5: How to download index files using the Lexis Platform](f_lexis_datadownload.md)** - Provides guidance on downloading index files through the Lexis Platform (content coming soon).

- **[Tutorial 6: Pushing Data to OpenSearch](g_data_push_to_opensearch.md)** - Demonstrates how to use `owilix` to push web data into OpenSearch clusters for internal search augmentation.

- **[Tutorial 7: Filter Sites and Push to OpenSearch](h_filter_sites_and_push_to_opensearch.md)** - Shows how to build an OpenSearch index for specific websites using URL filtering with `owilix`.

- **[Tutorial 8: How to Evaluate Components with TIRA/TIREx](i_evaluation_with_tira.md)** - Covers how to evaluate software components using the TIRA and TIREx evaluation platforms.

- **[Tutorial 9: How to Develop New Modules for Resilipipe](j_develop_resilipipe_modules.md)** - Explains how to create custom processing modules for the Resilipipe WARC processing pipeline.

- **[Tutorial 10: Hosting the OWI on your own S3](l_hosting_owi_on_your_s3.md)** - Shows how to host Open Web Index data on your own S3 bucket for faster access and querying.

- **[Tutorial 11: Building an OWI Lake](m_building_an_owi_lake.md)** - Shows how to integrate daily index shards into a data lake as a staging area for indexing and analytics.

- **[Tutorial 12: Uploading your own dataset](n_how_to_upload_your_own_datasets.md)** - Shows how to upload your own non-OWI dataset to share resources related to search, NLP, and AI.

- **[Tutorial 99: Data Upload](o_data_upload.md)** - Covers how to upload and share data using the `owilix` command-line tool (experimental feature).

- **[Tutorial 14: Finetuning an LLM with OWI data using the LUMI supercomputer](./p_finetuning_tutorial.ipynb)** - Shows how to use Open Web Index data to finetune an LLM on the LUMI supercomputer.
