Understanding the Geo Provenance Dataset
- Giuliana Bruni
- 2 days ago

We are excited to introduce the latest in open source software intelligence from SCANOSS: the Geo Provenance dataset. Have you ever wondered where your open source components come from? With our new Geo Provenance dataset, you can now uncover the geographical origin of contributors to OSS projects.
Geo Provenance intelligence plays a crucial role in understanding the origin of the code in open source projects. This dataset allows organisations to assess the credibility of each contribution, and it provides an easy-to-integrate filter for applying customisable geopolitical policies within the development pipeline.
The SCANOSS Geo Provenance dataset provides two types of intelligence: curated_locations and declared_locations. The curated_locations data is generated through a combination of AI-powered analysis and human validation, which streamlines the curation process while ensuring accuracy and reliability. In contrast, the declared_locations data consists of self-reported geographical information provided by contributors to open source projects. Together, the two give a view of the global distribution of a project's contributors.
You can consume this dataset from the command line through our scanoss-py client. For example, to receive detailed information about contributor locations in your terminal, run the command:
scanoss-py comp prv --key <api_key> --purl <package_url>
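For scripted use, the same command can be invoked from Python. The following is a minimal sketch, assuming scanoss-py is installed and available on the PATH, and assuming the command prints a JSON document to standard output (an assumption worth checking against the client's documentation). The helper names build_prv_command and fetch_provenance are hypothetical and are not part of the scanoss-py package.

```python
import json
import subprocess


def build_prv_command(api_key: str, purl: str) -> list[str]:
    """Build the argument list for the command shown above."""
    return ["scanoss-py", "comp", "prv", "--key", api_key, "--purl", purl]


def fetch_provenance(api_key: str, purl: str) -> dict:
    """Run the client and parse what it prints.

    Assumption: the client writes a JSON document to stdout.
    """
    result = subprocess.run(
        build_prv_command(api_key, purl),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(result.stdout)
```

Passing the arguments as a list, rather than as a single shell string, avoids quoting problems when a package URL contains special characters.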
Take advantage of this dataset to configure automated policies and enhance your existing workflow through our integrations in your CI/CD process.
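As one illustration of such a policy, the sketch below flags any country on a blocklist that appears among a component's curated locations. The data shape used here (a curated_locations list whose entries carry a country field) is an assumption made for illustration, not the documented SCANOSS schema, and the function name, blocklist, and sample values are hypothetical.

```python
from typing import Iterable


def blocked_countries(provenance: dict, blocklist: Iterable[str]) -> set[str]:
    """Return the blocklisted countries that appear in curated_locations.

    Assumption: `provenance` looks like
    {"curated_locations": [{"country": "<name>"}, ...], "declared_locations": [...]}.
    """
    blocked = {name.casefold() for name in blocklist}
    found = set()
    for entry in provenance.get("curated_locations", []):
        country = entry.get("country", "")
        # Compare case-insensitively so "freedonia" matches "Freedonia".
        if country.casefold() in blocked:
            found.add(country)
    return found


# Hypothetical sample data, not real SCANOSS output.
sample = {
    "curated_locations": [{"country": "Atlantis"}, {"country": "Freedonia"}],
    "declared_locations": [{"location": "Somewhere, Atlantis"}],
}

violations = blocked_countries(sample, ["freedonia"])
if violations:
    print("Policy violation:", sorted(violations))
```

In a CI job, a non-empty result could be turned into a non-zero exit code (for example with sys.exit(1)) so that the pipeline stops when the policy is violated.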
Understanding the geographic distribution of contributors offers significant advantages. It enhances transparency, allowing teams to better understand the origins of the third-party code they are using and the human dependencies of that code. For legal teams, knowing where contributors are located can help with navigating open source compliance and managing the risks associated with international collaboration.
We invite you to explore the SCANOSS Geo Provenance dataset and leverage its insights to improve your projects and internal decision-making. Learn more about how SCANOSS can support your business needs with our datasets.