| No. | Topic | URL | Problem addressed | Rows | Columns | Size | Data characteristics | Attribute characteristics | Associated tasks |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Physical Unclonable Functions Data Set | https://archive.ics.uci.edu/ml/datasets/Physical+Unclonable+Functions | The dataset is generated from a simulation of Physical Unclonable Functions (PUFs), specifically XOR Arbiter PUFs. PUFs are used for authentication. | >1 million | 129 | >1 GB | Multivariate | Integer | Classification |
| 2 | Kitsune Network Attack Dataset Data Set | https://archive.ics.uci.edu/ml/datasets/Kitsune+Network+Attack+Dataset | A cybersecurity dataset containing nine different network attacks on a commercial IP-based surveillance system and an IoT network. It includes reconnaissance, MitM, DoS, and botnet attacks. | >1 million | 115 | >1 GB | Multivariate, Sequential, Time-Series | Real | Classification, Clustering, Causal-Discovery |
| 3 | NYC Parking Tickets | https://www.kaggle.com/new-york-city/nyc-parking-tickets | The NYC Department of Finance collects data on every parking ticket issued in NYC (about 10 million per year). The data is made publicly available to aid in ticket resolution and to guide policymakers. When are tickets most likely to be issued? Is there any seasonality? Where are tickets most commonly issued? What are the most common years and types of cars to be ticketed? | >1 million | 51 | >1 GB | | | |
| 4 | N-BaIoT Dataset to Detect IoT Botnet Attacks | https://www.kaggle.com/mkashifn/nbaiot-dataset#1.benign.csv | This dataset addresses the lack of public botnet datasets, especially for the IoT. It provides real traffic data gathered from 9 commercial IoT devices authentically infected by Mirai and BASHLITE. | >1 million | 115 | >1 GB | Multivariate, Sequential | Real | Classification, Clustering |
| 5 | DDoS Dataset | https://www.kaggle.com/devendra416/ddos-datasets#unbalaced_20_80_dataset.csv | Balanced and unbalanced DDoS datasets. No recent public dataset is dedicated exclusively to DDoS, although IDS datasets are available, so the dataset author extracted DDoS flows from other public IDS datasets (CSE-CIC-IDS2018-AWS, CICIDS2017, CIC DoS data set (2016)). To introduce more variance, the DDoS data is taken from IDS datasets produced in different years with different experimental DDoS traffic generation tools. The extracted DDoS flows are combined with "Benign" flows, extracted separately from the same base datasets, into a single large dataset. | >1 million | 85 | >1 GB | | | |
| 6 | Best FE on clean and filtered data | https://www.kaggle.com/icarofreire/best-filter-and-feature-engineering#final_train2.csv | The two CSV files are the train and test data from Kaggle's Ion Switching Competition, with drift removed and a Kalman filter applied to reduce noise. | >1 million | 80 | >1 GB | | | |
| 7 | Dados_Brasil | https://www.kaggle.com/camposfabio/dados-brasil#Educacao_Basica_2018%20%20Docentes_Sudeste.csv | A dataset with information on basic education in Brazil in 2018. | >1 million | 132 | >1 GB | | | |
| 8 | KASANDR Data Set | http://archive.ics.uci.edu/ml/datasets/KASANDR | KASANDR is a novel, publicly available collection for recommendation systems that records the behavior of customers of Kelkoo, the European leader in e-commerce advertising. | >1 million | 2158859 | >1 GB | Multivariate | Integer | Causal-Discovery |
| 9 | DeepSat (SAT-4) Airborne Dataset | https://www.kaggle.com/crawford/deepsat-sat4?select=X_test_sat4.csv | 500,000 image patches covering four broad land cover classes. | >1 million | 3136 | >1 GB | | | |
| 10 | Complete 2017 Program Year Open Payments Dataset | https://www.cms.gov/OpenPayments/Explore-the-Data/Dataset-Downloads | A complete set of all data from the 2017 Program Year, which includes data reported about payments made from January 1 through December 31, 2017. | >1 million | 75 | >1 GB | | | |
| 11 | PAMAP2 Physical Activity Monitoring Data Set | https://archive.ics.uci.edu/ml/datasets/PAMAP2+Physical+Activity+Monitoring | The PAMAP2 Physical Activity Monitoring dataset contains data on 18 different physical activities performed by 9 subjects wearing 3 inertial measurement units and a heart rate monitor. | >1 million | 52 | <1 GB | Multivariate, Time-Series | Real | Classification |
| 12 | Los Angeles Building and Safety Permits | https://www.kaggle.com/cityofLA/los-angeles-building-and-safety-permits | A dataset hosted by the city of Los Angeles. | >1 million | 65 | <1 GB | | | |
| 13 | Predict Outcome of Pregnancy | https://www.kaggle.com/rajanand/ahs-woman-1 | This dataset contains data from the Annual Health Survey. Is it possible to predict the pregnancy outcome (live birth, still birth, or abortion)? | >1 million | 201 | <1 GB | | | |
| 14 | SIFT10M Data Set | https://archive.ics.uci.edu/ml/datasets/SIFT10M | In SIFT10M, each data point is a SIFT feature extracted from Caltech-256 with the open source VLFeat library. The corresponding patches of the SIFT features are provided. | >1 million | 128 | <1 GB | Multivariate | Integer | Causal-Discovery |
| 15 | Human Activity Recognition from Continuous Ambient Sensor Data Data Set | https://archive.ics.uci.edu/ml/datasets/Human+Activity+Recognition+from+Continuous+Ambient+Sensor+Data | This dataset represents ambient data collected in homes with volunteer residents. Data are collected continuously while residents perform their normal routines. | >1 million | 37 | >1 GB | Multivariate, Sequential, Time-Series | Integer, Real | Classification |
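
The row count, column count, and file size listed for each dataset can be checked on a downloaded CSV file without loading it into memory. The following is a minimal sketch, not part of the original list: the function name `describe_csv` is hypothetical, and it assumes a UTF-8 comma-separated file whose first line is a header, with 1 GB taken as 10^9 bytes.

```python
import csv
import os


def describe_csv(path, gb_threshold=1_000_000_000):
    """Stream a CSV file and report its row count, column count, and size.

    Assumptions (not from the original list): the file is UTF-8,
    comma-separated, has a header line, and 1 GB means 10**9 bytes.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])      # first line is treated as the header
        rows = sum(1 for _ in reader)  # count data rows one at a time
    size_bytes = os.path.getsize(path)
    return {
        "rows": rows,
        "columns": len(header),
        "size_bytes": size_bytes,
        "over_1m_rows": rows > 1_000_000,
        "over_1gb": size_bytes > gb_threshold,
    }
```

Because the file is read one row at a time, memory use stays roughly constant even for files larger than 1 GB. Datasets split across several CSV files would need the counts summed over all files.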