Predicting E-commerce Item Sales With Web Environment Temporal Background
Formal Metadata
Title: Predicting E-commerce Item Sales With Web Environment Temporal Background
Number of Parts: 30
License: CC Attribution - NonCommercial 4.0 International: You may use, modify, and reproduce the work or content in unchanged or modified form, and distribute and make it publicly available, for any legal and non-commercial purpose, provided you credit the author/rights holder in the manner specified by them.
Identifiers: 10.5446/53695 (DOI)
Transcript: English (automatically generated)
00:00
My name is Yihong Zhang, and I'm from Osaka University. I will present our work, Predicting E-commerce Item Sales With Web Environment Temporal Background. This is joint work between myself and Professor Takahiro Hara.
00:22
As an overview, this talk will cover the background, the methodology, and the experimental evaluation, followed by the conclusion. So this is the background we are working on: e-commerce sales prediction in the context of temporary sales.
00:48
Another name for temporary sales, which you may be more familiar with, is flash sales. Such sales usually involve a deep discount and a limited time.
01:05
The benefits of predicting such sales include better management of inventory, and the ability to decide a better time to start a sale,
01:21
which can lead to success. Factors affecting sales in this kind of temporary sale include the right product and promotion, the right words for the campaign description, and the timing. This last factor is what we are dealing with.
01:47
We say we are predicting sales, but it is not actually the number of sales we are predicting. We treat this prediction as a recommendation problem. In a traditional recommendation problem, you have a number of users and a number of items,
02:06
and, given a user, you want to rank the items based on the user's preference, that is, depending on whether the user will buy each item or not.
02:23
In our problem, we formulate it this way: given a time, we rank items by matching them with temporal information. Basically, the user becomes the time, and the item plays the same role as the item in a recommendation system.
02:44
One part of the background from which we draw inspiration is the many works that use social media as a temporal background in prediction tasks. I'll briefly go through four of them. The first is stock market movement prediction.
03:05
People take the tweets in social media that are associated with particular stocks, run sentiment analysis on them, and use the result to predict the stock movement.
03:20
There is also predictive policing, sometimes known as crime prediction. People take the tweets, divide them into different topics, and then classify whether an area is more likely to have a crime or not. In election prediction, the tweets associated with each candidate are also used
03:49
for making a prediction. In epidemic tracking, people use tweets that mention the disease and the location to predict the spread of an epidemic.
04:05
So in all these cases, we have a specific problem, and the social media is not directly associated with that problem, but somehow, as a background, we can use the information in
04:21
the social media to predict something. So in this work, we try to predict item sales from social media, and another source is the items sold in the same domain,
04:42
which is in-domain data. This figure shows our framework. We devised our method in four steps. The first step is to learn word embeddings using Word2Vec on a large text corpus, such as Wikipedia. Then we predict the current temporal
05:07
embedding using the past temporal embeddings. This past temporal embedding contains two parts: the social media embedding and the sold item description embedding.
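(The talk does not spell out how the tweets and item descriptions are aggregated into a single vector per day; the following is a minimal sketch assuming a simple average of word vectors, with all names mine.)

import numpy as np

def embed_texts(texts, word_vectors, dim=50):
    # Average the word vectors of all tokens across one day's texts.
    # Averaging is an assumed aggregation, not confirmed by the talk.
    vecs = [word_vectors[tok] for text in texts for tok in text if tok in word_vectors]
    return np.mean(vecs, axis=0) if vecs else np.zeros(dim)

def temporal_embedding(day, tweets_by_day, sold_by_day, word_vectors):
    # The two parts of the temporal background for one day:
    # the social media embedding and the sold item description embedding.
    return (embed_texts(tweets_by_day[day], word_vectors),
            embed_texts(sold_by_day[day], word_vectors))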
05:22
We use the past embeddings of these two sources to predict the current embedding. Then, in the third step, we take a new item and obtain the embedding for this item. In the final step, the fourth step, we compare these two embeddings and predict the sales
05:43
based on their cosine similarity. I think the first step and the third step are easy to understand, so I will focus on the second step and the fourth step. Let me start with the fourth step, which is predicting item sales based on cosine similarity.
06:07
We make the assumption that an item that is more consistent with the temporal background tends to have higher demand. This consistency is calculated as the cosine similarity
06:21
between the item and the temporal background, both in vector form. The cosine similarity formula is shown in the middle of the slide. We have two sources, as I said: the social media and the items sold in the past.
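(A minimal sketch of this scoring step; the talk confirms only the cosine similarity itself, so the surrounding names and the ranking helper are mine.)

import numpy as np

def cosine(a, b):
    # cos(a, b) = a . b / (||a|| * ||b||)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def rank_items_at_time(item_vectors, background):
    # Score each candidate item against the temporal background at time t
    # and rank by descending consistency.
    scores = {item: cosine(vec, background) for item, vec in item_vectors.items()}
    return sorted(scores, key=scores.get, reverse=True)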
06:45
We then rank the items at time t based on the value of the cosine similarity. So let's assume that the current temporal background is already obtained;
07:03
it is obtained in the previous step, the second step. Here we also make the assumption that the consistency calculated using the current temporal background is the most accurate. However, suppose I want to predict
07:25
item sales for tomorrow: we cannot obtain tomorrow's temporal background today, so we need to somehow predict the future temporal background using the past data. This essentially becomes a time series prediction problem, for which there are already many
07:47
solutions. We examine two solutions: a naive solution, which is to reuse the values from the previous time frame, and a more advanced solution, which is a recurrent neural network.
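(A minimal sketch of the two solutions, written with Keras; the 48-node LSTM layer, the 30-node fully connected layer, and the three-day lookback come from the implementation details given later in the talk, while everything else, including the framework choice, is my assumption.)

import numpy as np
from tensorflow.keras import Sequential
from tensorflow.keras.layers import LSTM, Dense

DIM = 50       # temporal embedding dimension (from the talk)
LOOKBACK = 3   # days of past embeddings used as input (from the talk)

def naive_predict(history):
    # Naive solution: tomorrow's temporal embedding = the previous time frame's.
    return history[-1]

def build_rnn():
    # More advanced solution: an LSTM over the past LOOKBACK days of embeddings.
    model = Sequential([
        LSTM(48, input_shape=(LOOKBACK, DIM)),
        Dense(30, activation="relu"),
        Dense(DIM),  # the predicted next-day temporal embedding
    ])
    model.compile(optimizer="adam", loss="mse")
    return model

# Training would use X of shape (samples, LOOKBACK, DIM) and y of shape (samples, DIM).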
08:06
Now let's move on to the experiments. Here is some information about the datasets. We have an e-commerce dataset that consists of 68,000 temporary deals between October 2016 and August
08:30
2017. On average, around 1,000 deals are available on a given day. These deals are normally active for a period of between 7 and 14 days. The deals attracted 1.6 million
08:50
purchases, and each deal is also associated with a text description written in Japanese.
09:00
We also have a social media dataset. It consists of 11,000 accounts involved in Japanese political discussion on Twitter. We collected past tweets from these accounts, aligned them to the period of the e-commerce data, and finally obtained 1.7
09:23
million tweets. Some implementation details: the first step, as you remember, is learning the word embeddings using Word2Vec. The corpus we chose is Japanese Wikipedia, and the dimension is 50.
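(A minimal sketch of this step with gensim; only Word2Vec, Japanese Wikipedia, and dimension 50 are stated in the talk, so the toy corpus and the remaining hyperparameters are placeholders.)

from gensim.models import Word2Vec

# Toy stand-in corpus; the talk actually trains on a tokenized Japanese Wikipedia dump.
corpus = [["今日", "の", "ニュース"], ["人気", "商品", "の", "セール"]]
model = Word2Vec(sentences=corpus, vector_size=50, window=5, min_count=1)
word_vectors = model.wv  # maps token -> 50-dimensional vector for the later steps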
09:42
For text processing, we have the item descriptions and the tweets, both written in Japanese, so we use a package called Kuromoji to do the tokenization and part-of-speech tagging. For the RNN, we have one LSTM layer with 48 nodes and one fully connected layer with 30 nodes,
10:06
and the lookback period is three days. We actually have two tasks to evaluate. The first is predicting the current temporal background using the past data, and the second is
10:24
predicting actual item sales. Here is the result for the first task, predicting the current temporal embedding. The accuracy is measured by MAE and RMSE; the smaller the value, the more accurate the method. Using the previous-day naive solution
10:45
is worse than using the more advanced solution, the RNN, so this is no surprise. Here are the results for the item sales prediction. I listed the evaluation for a number of
11:08
measures. The first is the random baseline, the second is the random forest baseline, and then we have item-now and tweet-now. These are not
11:20
actual predictions, because they use data from the future that cannot be obtained on the day itself; they are put here for reference. I also have the item and tweet embeddings for the previous-day measure and for the RNN.
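(The comparisons that follow use recall@k and MAP@k; the talk does not give their exact definitions, so this sketch assumes the standard ones.)

def recall_at_k(ranked, relevant, k):
    # Fraction of actually purchased (relevant) items found in the top k.
    return len(set(ranked[:k]) & set(relevant)) / len(relevant)

def map_at_k(ranked, relevant, k):
    # Average precision at k for one test day: mean of the precision values
    # at each rank where a relevant item appears, normalized by min(|relevant|, k).
    # MAP@k then averages this over all test days.
    hits, precision_sum = 0, 0.0
    for i, item in enumerate(ranked[:k], start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / i
    return precision_sum / min(len(relevant), k)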
11:42
We can draw some insights from these results. The first insight is that all the prediction measures are better than the random baseline, which means the temporal background embedding actually contains predictive hints for the item sales. The second insight is that the "now"
12:03
embeddings achieve the best results, showing that if the current temporal embedding can be correctly predicted, the accuracy can be improved. The third insight is that the RNN is slightly better than the previous-day measure, especially for recall@100. However,
12:26
comparing the tweet embedding and the past item embedding, the tweet embedding is more predictive in terms of recall@k, but for MAP@k, the item
12:45
embedding is actually better. I think that is because this measure picks up some special items that match the past item sales and pushes them to a higher position in the ranking, which causes the MAP to be higher. However, the tweet embedding covers more items, so
13:07
its recall@k is higher. For the item analysis, the table at the top shows the social media words on test day one, and below it are two items that we can consider a true positive
13:26
and a false positive. The first item is considered a true positive because it ranked high both in the actual sales ranking and in the predicted ranking. The second item is a false positive because it ranked high in the predicted ranking but low in the actual sales ranking. Both items
13:45
mention some words similar to the words in social media that day, such as certain events and a trouble investigation. However, the second item is not a popular item;
14:04
it just happened to mention words that were popular in social media that day. How to improve on such cases will be future work. In conclusion, we investigated whether a temporal background built from social media and past sales can be used to predict temporary deal
14:26
sales. We verified the hypothesis that items that are more consistent with the temporal background will have higher sales. We proposed a framework to generate the temporal background and compare it with the items to be sold. Evaluation with real-world data shows that
14:44
the temporal background can indeed be effective for the prediction. In the future, we plan to incorporate these findings into a full-scale recommendation system. That is the end of my presentation. Thank you for your attention. I welcome any questions.