  • Query_User_Item_Ai4b

    Providers : Taobao Search

    Posted : 2017.06.16

    #Participants : 0

Data Set Description

Format

1. Logs of ten days from the Taobao search engine, containing users' click behavior. Abnormal users are filtered out. The schema of the data set is listed after these notes.
2. Using the data of the first nine days as training data and the data of the 10th day as test data, we predict whether each sample is clicked or not.
3. We use logistic regression as the baseline model; its AUC on the test set is 0.5979.

item_id STRING comment 'item ID'
usertag STRING comment 'buyer tag'
query STRING comment 'query'
ipv BIGINT comment 'click times'
ds STRING comment 'date'
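
The evaluation setup described above (hold out the last day, score predictions by AUC) can be sketched as follows. This is only an illustration under stated assumptions, not the providers' pipeline: it assumes rows are already parsed into dicts keyed by the schema's column names, that `ds` values sort chronologically as strings (for example `yyyymmdd`), and that a sample counts as clicked when `ipv > 0`. The function names `split_by_day` and `auc` are hypothetical helpers introduced here.

```python
from typing import Dict, List, Sequence, Tuple

Row = Dict[str, object]


def split_by_day(rows: List[Row]) -> Tuple[List[Row], List[Row]]:
    """Hold out the latest `ds` value as the test set; all earlier days are training.

    Assumption: `ds` strings sort chronologically (e.g. 'yyyymmdd').
    """
    last_day = max(r["ds"] for r in rows)
    train = [r for r in rows if r["ds"] != last_day]
    test = [r for r in rows if r["ds"] == last_day]
    return train, test


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC computed from the rank sum of positives, using average ranks for ties."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        # Find the run of tied scores starting at position i.
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average 1-based rank of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative sample")
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Toy rows following the schema; the values are made up for illustration.
rows = [
    {"item_id": "i1", "usertag": "t1", "query": "q1", "ipv": 0, "ds": "20170601"},
    {"item_id": "i2", "usertag": "t2", "query": "q2", "ipv": 2, "ds": "20170602"},
    {"item_id": "i3", "usertag": "t1", "query": "q1", "ipv": 1, "ds": "20170603"},
    {"item_id": "i4", "usertag": "t3", "query": "q3", "ipv": 0, "ds": "20170603"},
]
train_rows, test_rows = split_by_day(rows)
# Assumed labeling rule: a sample is "clicked" when its click count is positive.
test_labels = [1 if r["ipv"] > 0 else 0 for r in test_rows]
```

Any model's predicted click probabilities for `test_rows` can then be passed to `auc` together with `test_labels`; a value of 0.5 corresponds to random ranking, which puts the reported baseline of 0.5979 in context.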


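--------------------------------------------------------Chinese description (translated)--------------------------------------------------------

Data Set Description

1. This data set consists of Taobao wireless (mobile) search logs covering 10 days in total, with abnormally behaving users removed. The data contain users' click behavior.
2. The data of the first 9 days are used as the training set and the data of the 10th day as the test set, to predict the click probability of each sample.
3. The baseline uses an LR (logistic regression) model; the test-set AUC is 0.5979.

item_id STRING comment 'item ID'
usertag STRING comment 'buyer user tag'
query STRING comment 'query term'
ipv BIGINT comment 'number of clicks'
ds STRING comment 'date'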