Tag
#dataset
56 repositories
Repos
Cryptocurrency historical price data library in Python. Data from https://coinmarketcap.com.
This repository contains 47,398 smart contracts extracted from the Ethereum network.
Datasets for evaluating smart contract security analysis tools (continuously updated)
SB Curated is a curated dataset of Solidity smart contracts annotated with tagged vulnerabilities. The dataset was created to evaluate the accuracy of automated analysis tools.
An R package that provides functions to retrieve historical and current cryptocurrency market data from CoinMarketCap's API. It allows users to access prices, exchange details, and global market information for various tokens.
Additional material for paper: Pump and Dumps in the Bitcoin Era: Real Time Detection of Cryptocurrency Market Manipulations, ICCCN '20
Easy-to-use cryptocurrency trading strategy simulator and backtester
248k cryptocurrency news items fetched from Cryptopanic.com
Python API for accessing Lake high frequency tick trades & order book data
Sequence-based Target Coin Prediction for Cryptocurrency Pump-and-Dump (SIGMOD 23)
A curated list of awesome smart contract datasets
The World's Largest Decentralized AGI Multimodal Dataset
📈 Free crypto market data (worth $500+/mo) for ML & research. Star ⭐ to keep it free!
Chartalist.org. Sponsored by the Canadian NSERC Discovery Grant RGPIN-2020-05665: Data Science on Blockchain and the National Science Foundation of USA under award number ECCS 2039701 Blockchain Graphs as Testbeds of Power Grid Resilience and Functionality Metrics.
[ICSE'26] FORGE: An LLM-driven Framework for Large-Scale Smart Contract Vulnerability Dataset Construction
Dataset containing source code and deployed bytecode for Solidity Smart Contracts that have been verified on Etherscan.io, along with a classification of their vulnerabilities according to the Slither static analysis framework.
A collection of json files used to automatically create models at https://charts.coinmetrics.io/formulas/
Aka the SolPunks Files. Punks datasets, tools, truth, etc. are all inside.
Open, daily-updated AHR999 (BTC hoarding index / 囤币指标) dataset — JSON + CSV + Astro dashboard, self-computed from Binance BTCUSDT closes.
🔖 Mapping of publicly known Ethereum addresses (name tags) as seen on https://etherscan.io/
The dataset builder script extracts all the relevant block information from the Bitcoin Blockchain through Mempool.space's public API. The data is stored in a .csv file, facilitating its use in data science and machine learning projects.
Filtering and ranking all 5,478 states in tic-tac-toe for efficient evaluation on the hardest ones
Additional material for paper: The Doge of Wall Street: Analysis and Detection of Pump and Dump Cryptocurrency Manipulations, TOIT '23
Decentralized Data Exchange Protocol to Unlock Data for Artificial Intelligence: Ocean protocol with smart contracts deployed on Ethereum network.
Hybrid Dataset Construction for AI-Driven Validator Selection in Proof-of-Stake Blockchain Networks
🧠️🖥️2️⃣️0️⃣️0️⃣️1️⃣️💾️📜️ The sourceCode:Fantom category for AI2001, containing Fantom programming language datasets
🧠️🖥️2️⃣️0️⃣️0️⃣️1️⃣️💾️📜️ The sourceCode:Motoko category for AI2001, containing Motoko programming language datasets
Inspired by a Kaggle hackathon loss, The Keginator was built to turn frustration into speed. It cleans messy CSVs, PDFs, MP3s, and WAVs, transcribes & chunks audio, and logs immutable hashes on Solana, giving your datasets trustworthy, verifiable integrity in seconds.
The dataset builder script extracts Bitcoin's Lightning Network statistics through Mempool.space's public API. The data is stored in a .csv file, facilitating its use in data science and machine learning projects.
This GitHub repository mirrors the original project hosted at GitLab. Syncing occurs within 5 minutes of updates to the main repository. ⤵
A library for managing your connections to different data sources. Still in alpha; please be patient.
Dataset with Historical Avalanche Incidents in Tirol
This project lets rusty-blockparser users turn the CSV files it produces into an ML dataset.
📈 Downloads history data from Binance and saves it to CSV 📈
Artifact bundle (datasets + analysis code) for the EMSE paper on smart contract upgradeability: event-derived version lineages, classification utilities, and reproducible notebooks.
A decentralized data marketplace built on Sui with pricing, provenance, and ownership verification powered by Walrus 🦭 and Seal 🔐.
Real-time Global Semiconductor Supply Chain API. Programmatic JSON-LD feed for Nvidia H100, B200, TSMC 3nm/5nm wafer pricing, ASML EUV capacity & silicon spot markets. Optimized for autonomous AI agents, LLM-scrapers & hedge fund quant-trading. Supports HTTP 402 M2M Solana settlement. Essential data for Bloomberg, Reuters & supply chain disruption.