Tianchi Dataset

VALID: Nationwide Arrival Detection via Bluetooth

Description

VALID: Nationwide Arrival Detection via Bluetooth.

Data List

  • Data name | Upload date | Size
  • VALID_Courier_feedback_data.zip | 2021-08-12 | 9.95 MB
  • VALID_Courier_device.zip | 2021-08-12 | 51.88 MB
  • VALID_Merchant_device.zip | 2021-08-12 | 172.01 MB
  • VALID_Courier_report.zip | 2021-08-12 | 788.06 MB
  • VALID_Sensing.zip | 2021-08-12 | 5.25 GB

Documentation

1. Overview

Large-scale mobile sensing networks sit at the intersection of multiple research communities, including mobile computing, spatial-temporal data mining, and human-system synergy. However, progress in studying large-scale mobile sensing networks has not met industrial expectations. One of the main reasons is that few well-organized datasets provide large amounts of labeled sensing data with human feedback to support such studies.

In a previous dataset (aBeacon), we released a repository of citywide BLE sensing, location trace, and manual report data.

In this dataset, we release a repository of nationwide BLE sensing data based on smartphone advertising and scanning, with human reports as labels and human responses to system intervention as feedback.

Details on the data and the systems can be found in our papers on the systems:

[NSDI '21] Yi Ding, Ling Liu, Yu Yang, Yunhuai Liu, Tian He, Desheng Zhang. From Conception to Retirement: a Lifetime Story of a 3-Year-Old Operational Wireless Beacon System in the Wild. In USENIX NSDI 2021.

[SIGCOMM '21] Yi Ding, Yu Yang, Wenchao Jiang, Yunhuai Liu, Tian He, Desheng Zhang. Nationwide Deployment and Operation of a Virtual Arrival Detection System in the Wild. In ACM SIGCOMM 2021.

2. Data Description

VALID provides the BLE sensing, manual report data, and courier feedback data of 55,000 couriers at 113,000 merchant locations in ten cities during one month.

2.1 Data Format

The VALID dataset includes:

  • VALID sensing data (VALID_sensing.zip)
  • VALID courier device data (VALID_courier_device.zip)
  • VALID merchant device data (VALID_merchant_device.zip)
  • VALID courier report data (VALID_courier_report.zip)
  • VALID courier feedback data (VALID_courier_feedback.zip)

2.1.1 VALID sensing data

The downloaded VALID sensing data should have the following form:

VALID_sensing.zip
├── Beacon_data_city_1.csv.zip
│   ├── Beacon_data_city_1_day_1.csv.zip
│   ├── Beacon_data_city_1_day_2.csv.zip
│   ├── …
├── Beacon_data_city_2.csv.zip
│   ├── Beacon_data_city_2_day_1.csv.zip
│   ├── Beacon_data_city_2_day_2.csv.zip
│   ├── …


Each row of the data records a Bluetooth broadcast monitored on a courier's smartphone.

  • Beacon_data_city_X_day_Y >> city_id is the ID of the city.
  • Beacon_data_city_X_day_Y >> day is the day that the data is collected.
  • Beacon_data_city_X_day_Y >> courier_id_hash is the hashed courier ID.
  • Beacon_data_city_X_day_Y >> shop_id_hash is the hashed merchant ID.
  • Beacon_data_city_X_day_Y >> beacon_id_hash is the hashed beacon ID.
    Note that courier_id_hash, shop_id_hash and beacon_id_hash can be used as the foreign keys in both this dataset and the aBeacon dataset.
  • Beacon_data_city_X_day_Y >> hour is the time (hour) when the BLE beacon data is collected.
  • Beacon_data_city_X_day_Y >> minute is the time (minute) when the BLE beacon data is collected.
  • Beacon_data_city_X_day_Y >> second is the time (second) when the BLE beacon data is collected.
  • Beacon_data_city_X_day_Y >> rssi is the received signal strength indicator on the courier's smartphone.
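The nested archive layout and the split hour/minute/second columns above can be handled with a small standard-library helper. The following is a minimal sketch, not an official loader: it assumes each CSV is UTF-8 encoded and has a header row with the column names listed above, and the function names iter_csv_rows and seconds_of_day are hypothetical.

```python
import csv
import io
import zipfile


def iter_csv_rows(archive):
    """Yield dict rows from every CSV found in a (possibly nested) zip archive.

    `archive` is a file path or a binary file-like object.
    Assumption: CSV files are UTF-8 and start with a header row.
    """
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            if name.endswith("/"):
                continue  # skip directory entries
            data = zf.read(name)
            if name.lower().endswith(".zip"):
                # Nested archive (e.g. a per-city or per-day zip): recurse on its bytes.
                yield from iter_csv_rows(io.BytesIO(data))
            elif name.lower().endswith(".csv"):
                text = io.StringIO(data.decode("utf-8"), newline="")
                yield from csv.DictReader(text)


def seconds_of_day(row, prefix=""):
    """Combine <prefix>hour, <prefix>minute, <prefix>second into seconds since midnight."""
    hour = int(row[prefix + "hour"])
    minute = int(row[prefix + "minute"])
    second = int(row[prefix + "second"])
    return hour * 3600 + minute * 60 + second
```

Each nested archive is read fully into memory before it is opened, so for the multi-gigabyte sensing archive it may be preferable to extract the per-city archives to disk first and call iter_csv_rows on one of them at a time.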

2.1.2 VALID courier device data

The downloaded VALID courier device data should have the following form:

VALID_courier_device.zip
├── Courier_device_data.csv

Each row of the data records the hardware information of a courier's smartphone.

  • Courier_device_data >> city_id is the ID of the city.
  • Courier_device_data >> day is the day on which the device information is collected, since couriers may switch devices from one day to another.
  • Courier_device_data >> courier_id_hash is the hashed courier ID.
  • Courier_device_data >> os_version is the OS version of the courier’s smartphone.
  • Courier_device_data >> phone_mode is the phone model of the courier's smartphone, e.g., "COL-AL10" for HUAWEI HONOR 10.

2.1.3 VALID merchant device data

The downloaded VALID merchant device data should have the following form:

VALID_merchant_device.zip
├── Merchant_device_data.csv

Each row of the data records the hardware information of a merchant's smartphone.

  • Merchant_device_data >> city_id is the ID of the city of the merchant.
  • Merchant_device_data >> day is the day on which the device information is collected, since merchants may switch devices from one day to another.
  • Merchant_device_data >> shop_id_hash is the hashed merchant ID.
  • Merchant_device_data >> beacon_id_hash is the hashed beacon ID.
  • Merchant_device_data >> manufacturer is the manufacturer of the merchant's smartphone. e.g., "HUAWEI".
  • Merchant_device_data >> brand is the brand of the merchant's smartphone. e.g., "HONOR" for HUAWEI HONOR.
  • Merchant_device_data >> model is the phone model of the merchant's smartphone, e.g., "COL-AL10" for HUAWEI HONOR 10.
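The two device tables can be used to attach a sender (merchant) model and a receiver (courier) model to each sensing row. The sketch below is one possible way to do this with the standard library; it assumes that (day, beacon_id_hash) identifies a merchant device record and (day, courier_id_hash) identifies a courier device record, which the description suggests but does not state outright. The function names are hypothetical, and rows are plain dicts such as those produced by csv.DictReader.

```python
from collections import defaultdict


def build_lookup(rows, key_fields, value_field):
    """Map a tuple of key columns to one value column (later rows win)."""
    return {tuple(r[k] for k in key_fields): r[value_field] for r in rows}


def mean_rssi_by_device_pair(sensing_rows, courier_devices, merchant_devices):
    """Average RSSI per (sender model, receiver model) pair.

    Assumption: the merchant phone is the advertiser (sender) and the
    courier phone is the scanner (receiver), matched per day.
    """
    receiver = build_lookup(courier_devices, ("day", "courier_id_hash"), "phone_mode")
    sender = build_lookup(merchant_devices, ("day", "beacon_id_hash"), "model")
    totals = defaultdict(lambda: [0.0, 0])  # pair -> [sum of rssi, count]
    for r in sensing_rows:
        tx = sender.get((r["day"], r["beacon_id_hash"]))
        rx = receiver.get((r["day"], r["courier_id_hash"]))
        if tx is None or rx is None:
            continue  # no device record for that day
        acc = totals[(tx, rx)]
        acc[0] += float(r["rssi"])
        acc[1] += 1
    return {pair: total / count for pair, (total, count) in totals.items()}
```

Grouping by model pair is only one choice; the same lookups could be keyed on os_version, manufacturer, or brand instead.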

2.1.4 VALID Courier report data

The downloaded VALID courier report data should have the following form:

VALID_courier_report.zip
├── Courier_report_data_city_1.csv.zip
│   ├── Courier_report_data_city_1.csv
├── Courier_report_data_city_2.csv.zip

Each row of the data records the four timestamps from the courier's report for a delivery order.

  • Courier_report_data >> city_id is the ID of the city of the merchant.
  • Courier_report_data >> day is the day of the order.
    Note that day serves as a foreign key for joining with the VALID sensing data.
  • Courier_report_data >> courier_id_hash is the hashed courier ID.
  • Courier_report_data >> shop_id_hash is the hashed merchant ID.
  • Courier_report_data >> acceptance_hour is the time (hour) at which the courier accepts the order.
  • Courier_report_data >> acceptance_minute is the time (minute) at which the courier accepts the order.
  • Courier_report_data >> acceptance_second is the time (second) at which the courier accepts the order.
  • Courier_report_data >> arrival_hour is the time (hour) at which the courier reports arrival at the merchant in the courier's app.
  • Courier_report_data >> arrival_minute is the time (minute) at which the courier reports arrival at the merchant in the courier's app.
  • Courier_report_data >> arrival_second is the time (second) at which the courier reports arrival at the merchant in the courier's app.
  • Courier_report_data >> pickup_hour is the time (hour) at which the courier reports pickup of the order at the merchant in the courier's app.
  • Courier_report_data >> pickup_minute is the time (minute) at which the courier reports pickup of the order at the merchant in the courier's app.
  • Courier_report_data >> pickup_second is the time (second) at which the courier reports pickup of the order at the merchant in the courier's app.
  • Courier_report_data >> delivery_hour is the time (hour) at which the courier reports delivery completion in the courier's app.
  • Courier_report_data >> delivery_minute is the time (minute) at which the courier reports delivery completion in the courier's app.
  • Courier_report_data >> delivery_second is the time (second) at which the courier reports delivery completion in the courier's app.
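As an illustration of joining the report data with the sensing data, the sketch below compares each reported arrival time with the first Bluetooth detection of the same courier at the same merchant. It is a sketch under stated assumptions rather than the authors' method: it joins on (day, courier_id_hash, shop_id_hash), considers only detections between order acceptance and pickup, and assumes an order does not span midnight. The function names are hypothetical, and rows are plain dicts such as those produced by csv.DictReader.

```python
from collections import defaultdict


def to_seconds(row, prefix=""):
    """Seconds since midnight from the <prefix>hour/minute/second columns."""
    return (int(row[prefix + "hour"]) * 3600
            + int(row[prefix + "minute"]) * 60
            + int(row[prefix + "second"]))


def index_detections(sensing_rows):
    """Group detection times by (day, courier, shop), sorted ascending."""
    index = defaultdict(list)
    for r in sensing_rows:
        key = (r["day"], r["courier_id_hash"], r["shop_id_hash"])
        index[key].append(to_seconds(r))
    for times in index.values():
        times.sort()
    return index


def arrival_gaps(report_rows, detections):
    """Reported arrival minus first detection in [acceptance, pickup], per order.

    A positive gap means the courier reported arrival after the first
    detection. Orders without any detection in the window are skipped.
    """
    gaps = []
    for r in report_rows:
        key = (r["day"], r["courier_id_hash"], r["shop_id_hash"])
        accepted = to_seconds(r, "acceptance_")
        picked_up = to_seconds(r, "pickup_")
        reported = to_seconds(r, "arrival_")
        in_window = [t for t in detections.get(key, []) if accepted <= t <= picked_up]
        if in_window:
            gaps.append(reported - in_window[0])
    return gaps
```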

2.1.5 VALID Courier feedback data

The downloaded VALID courier feedback data should have the following form:

VALID_courier_feedback.zip
├── Courier_feedback_data_city_1.csv.zip
├── Courier_feedback_data_city_2.csv.zip

Note that because the intervention notification was tested only in some cities, feedback data is available for 9 cities only.

Each row of the data records a courier's response to an intervention notification for a delivery order.

  • Courier_feedback_data >> courier_id_hash is the hashed courier ID.
  • Courier_feedback_data >> shop_id_hash is the hashed merchant ID.
  • Courier_feedback_data >> day is the day of the order.
    Note that day serves as a foreign key for joining with the VALID sensing data.
  • Courier_feedback_data >> acceptance_hour is the time (hour) at which the courier accepts the order.
  • Courier_feedback_data >> acceptance_minute is the time (minute) at which the courier accepts the order.
  • Courier_feedback_data >> acceptance_second is the time (second) at which the courier accepts the order.
    Note that acceptance_hour, acceptance_minute, and acceptance_second can be used as foreign keys into the courier report data of this dataset.
  • Courier_feedback_data >> feedback_hour is the time (hour) at which the courier responds to the intervention notification in the app.
  • Courier_feedback_data >> feedback_minute is the time (minute) at which the courier responds to the intervention notification in the app.
  • Courier_feedback_data >> feedback_second is the time (second) at which the courier responds to the intervention notification in the app.
  • Courier_feedback_data >> feedback is the courier's response to the intervention notification in the app (0 indicates Try-Later and 1 indicates Confirm).
  • Courier_feedback_data >> distance_to_merchant is the distance between the courier's location (from smartphone GPS) and the shop (from POI information) when the courier responds to the intervention notification in the app.
  • Courier_feedback_data >> gps_measure_offset is the time difference between the feedback time and the GPS measurement time, i.e., feedback_timestamp - GPS_measurement_timestamp in seconds.
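The feedback rows can be linked back to the courier report data through the acceptance time, and the GPS measurement time can be recovered from gps_measure_offset. The sketch below assumes that the acceptance time is combined with day, courier_id_hash, and shop_id_hash to identify an order, and that gps_measure_offset parses as a number of seconds. The function names are hypothetical, and rows are plain dicts such as those produced by csv.DictReader.

```python
def order_key(row):
    """Key shared by a feedback row and its courier report row (assumed unique per order)."""
    return (
        row["day"],
        row["courier_id_hash"],
        row["shop_id_hash"],
        int(row["acceptance_hour"]),
        int(row["acceptance_minute"]),
        int(row["acceptance_second"]),
    )


def attach_reports(feedback_rows, report_rows):
    """Pair each feedback row with its report row, or None if no match is found."""
    reports = {order_key(r): r for r in report_rows}
    return [(f, reports.get(order_key(f))) for f in feedback_rows]


def gps_measurement_time(row):
    """GPS measurement time in seconds since midnight.

    Follows the definition above:
    gps_measure_offset = feedback_timestamp - GPS_measurement_timestamp.
    """
    feedback_t = (int(row["feedback_hour"]) * 3600
                  + int(row["feedback_minute"]) * 60
                  + int(row["feedback_second"]))
    return feedback_t - float(row["gps_measure_offset"])
```

A large gps_measure_offset means the GPS fix behind distance_to_merchant is stale, so it may be worth filtering on it before interpreting the distance.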

2.1.6 City ID Mapping

City ID | City
1 | Shanghai
2 | Hangzhou
3 | Shenzhen
4 | Beijing
5 | Weifang
6 | Ganzhou
7 | Suining
8 | Xining
9 | Dingan
10 | Langfang
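For convenience, the mapping can be kept as a small lookup table in code. This sketch assumes that the city_id column holds the integers 1 to 10, possibly read as strings from CSV; the names CITY_NAMES and city_name are hypothetical.

```python
# City ID mapping as listed in this section.
CITY_NAMES = {
    1: "Shanghai",
    2: "Hangzhou",
    3: "Shenzhen",
    4: "Beijing",
    5: "Weifang",
    6: "Ganzhou",
    7: "Suining",
    8: "Xining",
    9: "Dingan",
    10: "Langfang",
}


def city_name(city_id):
    """Return the city name for a city_id given as an int or a numeric string."""
    return CITY_NAMES[int(city_id)]
```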

2.1.7 Date Mapping with aBeacon dataset

The courier ID and shop ID in the VALID dataset can be used as foreign keys to join with those in the aBeacon dataset. However, the date information in the VALID dataset and the aBeacon dataset is NOT correlated.

3. Dataset Citation

Please cite the following paper for the use of this dataset in any publications.
Yi Ding, Yu Yang, Wenchao Jiang, Yunhuai Liu, Tian He, Desheng Zhang. Nationwide Deployment and Operation of a Virtual Arrival Detection System in the Wild. In Proceedings of the Annual conference of the ACM Special Interest Group on Data Communication on the applications, technologies, architectures, and protocols for computer communication, pp. XX-XX. 2021.

If you have published a paper using our dataset, please send the publication URL to tianchi_open_dataset@alibabacloud.com. We will compile citation statistics and contact you to send a Tianchi gift.

4. Privacy Protection

To protect privacy against attacks, we take the following measures.
(1) Identity information (courier/merchant) is hashed.
(2) The absolute date is removed and only the hour/minute/second is kept. Note that the released data covers 30 consecutive days, but not necessarily a calendar month. For example, it may run from Apr. 10th to May 10th.

5. Potential Research Topics

(1) For VALID sensing data, we envision the following research topics:
a) Bluetooth network topology;
b) Relation between Bluetooth-based detection and courier report (with courier location report data);
c) Relation between physical BLE beacons and smartphone-based beacons (with aBeacon dataset).

(2) For VALID courier device and merchant device data, we envision the following research topics:
a) Impact of sender device types on Bluetooth advertising;
b) Impact of receiver device types on Bluetooth scanning.

(3) For VALID courier report data, we envision the following research topics:
a) Order scheduling strategy study.

(4) For VALID courier feedback data, we envision the following research topics:
a) Human behavior changes due to system interventions;
b) Human-system synergy.

6. Dataset Tags

Mobile Computing, Wireless Sensing, Network Design and Implementation, Spatial-Temporal Data Mining, Order Dispatching, Order Scheduling

7. License

The dataset is distributed under the CC BY-NC 4.0 license.
