Open Source Tools
All software within the CN project is open source.
CN-Editor
The editor is a modular tool for recording, researching, and analyzing ancient coins. It can easily be adapted to other object types that bear images and inscriptions and are available in large numbers.
https://github.com/telota/corpus-nummorum-editor
The training videos for the CN Editor can be downloaded here. The training courses currently cover the following topics:
The user interface
Creation of data sets
Importing coin data from other collections
Image editing functions
Natural Language Processing (NLP) and Image Recognition (IR)
The two following links make accessible some of our progress in applying Artificial Intelligence (AI) to the coins of the Corpus Nummorum (CN) project: our Natural Language Processing (NLP) and Image Recognition (IR) approaches. We provide them as Colab notebooks on GitHub, together with instructions for using both. Only a modern browser is required to run the notebooks; no prior knowledge of AI is needed. The "Open in Colab" button on each GitHub page leads directly to the notebook.
The NLP Notebook can be used to test the recognition of entities such as "Apollon" on individual coin descriptions from the CN database or on your own descriptions:
https://github.com/Frankfurt-BigDataLab/NLP-on-multilingual-coin-datasets
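As a toy illustration of the kind of entity recognition the notebook performs, the sketch below tags deity names in a coin description. The mini-gazetteer, label, and sample description are invented for this example; the actual notebook uses trained multilingual NLP models rather than a lookup list.

```python
import re

# Hypothetical mini-gazetteer; the real notebook uses trained NLP models.
ENTITIES = {
    "Apollon": "PERSON",
    "Zeus": "PERSON",
    "Hermes": "PERSON",
}

def tag_entities(description: str) -> list[tuple[str, str]]:
    """Return (surface form, label) pairs found in a coin description."""
    found = []
    for name, label in ENTITIES.items():
        if re.search(rf"\b{re.escape(name)}\b", description):
            found.append((name, label))
    return found

# Invented sample description in the style of CN obverse/reverse texts
desc = "Laureate head of Apollon right; on the reverse, Hermes standing left."
print(tag_entities(desc))  # → [('Apollon', 'PERSON'), ('Hermes', 'PERSON')]
```

In the notebook itself you can paste your own descriptions and compare the model's output against such expected entities.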
The IR Notebook identifies CN types or mints from coin images. The provided examples or self-selected images from the CN collection can be tested here:
https://github.com/Frankfurt-BigDataLab/IR-on-coin-datasets
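Conceptually, identifying a type from an image amounts to comparing the image's feature vector against reference vectors for known types. The sketch below shows this with made-up three-dimensional embeddings and cosine similarity; the notebook derives real embeddings with a trained neural network, and the type identifiers here are hypothetical.

```python
import numpy as np

# Hypothetical reference embeddings for three CN types; a real model
# would extract high-dimensional features from the coin photographs.
type_embeddings = {
    "cn_type_1001": np.array([0.9, 0.1, 0.0]),
    "cn_type_2044": np.array([0.1, 0.8, 0.2]),
    "cn_type_3310": np.array([0.0, 0.2, 0.9]),
}

def identify_type(query: np.ndarray) -> str:
    """Return the CN type whose embedding is closest (cosine) to the query."""
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    return max(type_embeddings, key=lambda t: cos(query, type_embeddings[t]))

query = np.array([0.85, 0.15, 0.05])  # embedding of an unseen coin image
print(identify_type(query))  # → cn_type_1001
```

The same nearest-neighbor idea applies whether the target is a coin type or a mint; only the reference set changes.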
Coin Image Dataset
This dataset is a collection of ancient coin images from three different sources: the Corpus Nummorum (CN) project, the Münzkabinett Berlin, and the Bibliothèque nationale de France, Département des Monnaies, médailles et antiques. It covers Greek and Roman coins from ancient Thrace, Moesia Inferior, Troad, and Mysia. Due to copyright restrictions, it contains only a selection of the coins published on the CN portal. Within the CN project, the dataset mainly serves to train Machine Learning-based Image Recognition models. With its publication we would like to invite you to try out your own ideas and models on our coin data.
Download from Zenodo
Data Quality Tool
The main idea of this tool is to execute pre-defined SPARQL queries (rules) against a SPARQL endpoint to identify data quality issues: inconsistencies within the data, missing values, and outliers. The results of these queries are compiled into an Excel file, with one spreadsheet per query plus an overview spreadsheet. In this file, domain experts can comment on the status of each reported issue (an issue may turn out to be no error, or the comment may record the reason for an inconsistency or missing value). The commented Excel file can be fed into the next run of the data quality check, and the tool will retain a) the date an issue was first reported and b) the comments made by the domain experts. We implemented the tool on top of RDF/SPARQL to allow better reuse: anyone who makes their data available via a SPARQL endpoint can use it, and groups sharing the same data model can reuse and share their SPARQL queries. The tool comes with the rules we generated for our Corpus Nummorum coin data, based on the Nomisma.org ontology.
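The carry-over of first-reported dates and expert comments between runs can be sketched as a merge keyed by issue identity. The rule names, field names, and sample URIs below are invented for illustration; the actual tool reads and writes Excel spreadsheets rather than Python dictionaries.

```python
# One row per issue, keyed by (rule id, offending resource URI).
# Keys and field names are illustrative, not the tool's real schema.
previous = {
    ("missing-mint", "cn:coin/123"): {
        "first_reported": "2023-05-02",
        "comment": "no error - mint genuinely unknown",
    },
}

def merge_run(previous: dict, current_issues: set, today: str) -> dict:
    """Carry first-reported dates and expert comments into a new run."""
    merged = {}
    for key in current_issues:
        old = previous.get(key)
        merged[key] = {
            "first_reported": old["first_reported"] if old else today,
            "comment": old["comment"] if old else "",
        }
    return merged

current = {("missing-mint", "cn:coin/123"), ("missing-weight", "cn:coin/456")}
report = merge_run(previous, current, today="2024-01-10")
print(report[("missing-mint", "cn:coin/123")]["first_reported"])  # → 2023-05-02
```

Issues that disappear from the current run simply drop out of the merged report, while new issues start with today's date and an empty comment.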
Imagines Nummorum VLM Index Card Data Extraction Pipeline
A tool for the automated analysis of index cards using a Vision-Language Model (Qwen2.5-VL). The system performs multi-stage image analysis: it classifies images, recognizes handwritten content, and extracts structured data.