have been working closely with the mobile team

  1. learned the new hub UEP concepts.
  2. designed and defined the initial standard for the newly added UEP tags for each kind of action: send, serve, impression, click, showdialog, carouselitemclick, dismiss.
  3. tested sample data at every stage (hand-made -> preprod -> prod -> 1% -> 5% -> 50% -> 100% ramp) and reported many issues on data elements.
  4. met the whole project timeline and delivered at the committed time.

have been communicating with the downstream Analytics and Science teams

  1. introduced the new hub UEP concepts and confirmed requirements.
  2. followed up on questions with the mobile team and got back to downstream.
  3. explained the detailed hub UEP data layout of the DSS tables so users could test.
  4. shared query methods for downstream use cases.

dev highlights

onboarded the new actions to the table

  1. serve: when the user clicks the bellIcon or the Recommended tab, pulls the page to refresh, etc., a new serve is dropped and the message content (items) is updated based on the newest recommendation result; accordingly, the message content in send is not up-to-date.
  2. merch_impression: the impression on the pop-up carousel.
  3. pl_click: the click on a carousel item.
  4. impression: already existed, but we extended impressions under the Recommended tab for iOS.

adopted a new join key for user behaviors

we proposed to the mobile team to pass a unique key of serve in all the events except send, so that we can use it to link user behaviors with serve when enriching columns to get the real detailed message or recommended item information.

  1. serve joins back to send by the unique key of send.
  2. impression/click/merch_impression/pl_click join back to serve by the unique key of serve.
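
The two-step link above can be sketched in a few lines of plain Python. This is only an illustration: the field names `send_key` and `serve_key` and the sample records are hypothetical placeholders, not the real tag or column names.

```python
# Hypothetical records; send_key and serve_key stand in for the real unique keys.
sends = [{"send_key": "s1", "chnl_site_id": 0}]
serves = [{"serve_key": "v1", "send_key": "s1", "message": "latest content"}]
events = [
    {"action": "impression", "serve_key": "v1"},
    {"action": "pl_click", "serve_key": "v1"},
]


def link_events(events, serves, sends):
    """Attach the matching serve, and that serve's send, to each user-behavior event."""
    serve_by_key = {s["serve_key"]: s for s in serves}
    send_by_key = {s["send_key"]: s for s in sends}
    linked = []
    for event in events:
        # Step 2 in the list: behavior -> serve by the unique key of serve.
        serve = serve_by_key.get(event["serve_key"])
        # Step 1 in the list: serve -> send by the unique key of send.
        send = send_by_key.get(serve["send_key"]) if serve else None
        linked.append({**event, "serve": serve, "send": send})
    return linked
```

Calling `link_events(events, serves, sends)` returns one row per event with its serve and send attached; an event whose serve key is not found keeps `None` for both.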

enriched and unified the vertical columns for all the actions, making them simple and efficient for downstream to use.

vertically speaking, as we and the mobile team designed and defined it, they only track data that is as non-redundant as possible, for example:

  1. SEND: because send does not have up-to-date message content, it only passes the message id, placement id and recommendation id in the message list, for real-time message id analysis. for all information other than message content, send is the single source of truth and that information is passed to send.
  2. SERVE: serve has the up-to-date message content, so for message content serve is the single source of truth and it is passed to serve. for the other information, only the basic ids are passed, such as tracking id, serve tracking id, user id, canvas id, etc.
  3. IMPRESSION, CLICK, MERCH_IMPRESSION, PL_CLICK:
  • 1 impression can contain multiple notifications. previously, the tracking was passed as below, and we needed to explode both tags and link them by position to get 3 rows. when more tags are added, the data would need to be exploded n times, which consumes a lot more computing resources. so we proposed to the mobile team a new way to pass tags, so that we only need to explode once no matter how many hub UEP tags are added.
before:

!ni_nids:nid1,nid2,nid3
!ni_nts:event_type1,event_type2,event_type3
after:

[
    {
        "eventtype": "",
        "mesgId": "",
        "serve.tracking.id": "",
        "annotation.cnv.id": "",
        "isUEP": "",
        "tracking.id": ""
    },
    {
        "eventtype": "",
        "mesgId": "",
        "serve.tracking.id": "",
        "annotation.cnv.id": "",
        "isUEP": "",
        "tracking.id": ""
    }
]
  • for the user's session information and the ids of the messages and recommended items the user actually saw or clicked, IMPRESSION, CLICK, MERCH_IMPRESSION and PL_CLICK are the single source of truth, and that information is passed to them.
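
To make the difference between the two tag formats concrete, here is a minimal sketch in plain Python. It assumes the old tags have already been parsed into a name-to-string mapping (without the leading `!`); the function names are made up for illustration and this is not the production job.

```python
import json


def explode_legacy(tags):
    """Old format: one comma-separated list per tag, linked by position."""
    # Every tag has to be split on its own, so each new tag adds another explode.
    columns = {name: value.split(",") for name, value in tags.items()}
    return [dict(zip(columns, row)) for row in zip(*columns.values())]


def explode_json(payload):
    """New format: one JSON array, where each element is already a complete row."""
    # A single parse yields all rows, no matter how many keys each object carries.
    return json.loads(payload)


legacy_rows = explode_legacy({
    "ni_nids": "nid1,nid2,nid3",
    "ni_nts": "event_type1,event_type2,event_type3",
})
json_rows = explode_json('[{"eventtype": "", "mesgId": ""}, {"eventtype": "", "mesgId": ""}]')
```

With the old format, `legacy_rows` holds 3 rows built by aligning positions across two lists. With the new format, adding a tag only adds a key inside each object, so the explode step does not change.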

but enriched and unified vertical columns are much preferred by our downstream teams for analysis or model training, so we did the following work:

  1. exploded impressions into 1 row per notification.
  2. serve joined back with the last 90 days of send to enrich all the information for which send is the single source of truth, for example: chnl_trckng_id, sent_event_type_desc, chnl_site_id and so on.
  3. user behaviors joined back with the last 90 days of serve, by serve unique key plus message id and recommended item id, to enrich all the information for which send and serve are the single source of truth, for example: message details, recommended item details and so on.
  4. defined and stored a unified unique key plus send and serve join keys for all the actions, so users can easily calculate counts or combine the reaction chain by these keys.
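
A rough sketch of the 90-day join-back in step 2, in plain Python. The `send_key` and `ts` fields are hypothetical, and measuring the 90 days backwards from the serve timestamp is an assumption; the actual job may define the window differently (for example, relative to the run date).

```python
from datetime import datetime, timedelta

LOOKBACK = timedelta(days=90)  # the "last 90 days" window from the steps above


def enrich_serve(serve, sends):
    """Copy send-owned columns onto a serve when a matching send falls inside the window."""
    for send in sends:
        age = serve["ts"] - send["ts"]
        if send["send_key"] == serve["send_key"] and timedelta(0) <= age <= LOOKBACK:
            return {
                **serve,
                "chnl_trckng_id": send["chnl_trckng_id"],
                "chnl_site_id": send["chnl_site_id"],
            }
    # No send inside the window: keep the serve row, leave the send-owned columns empty.
    return {**serve, "chnl_trckng_id": None, "chnl_site_id": None}


sends = [{"send_key": "s1", "ts": datetime(2024, 1, 1), "chnl_trckng_id": "t1", "chnl_site_id": 0}]
recent = enrich_serve({"send_key": "s1", "ts": datetime(2024, 2, 1)}, sends)
stale = enrich_serve({"send_key": "s1", "ts": datetime(2024, 6, 1)}, sends)
```

Here `recent` picks up the send columns because the send is 31 days older than the serve, while `stale` keeps them empty because the send is outside the window.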

exploded data into three horizontal layers

horizontally speaking, UEP data is born with 3 layers and users need to analyze each of them, so we also explode the flat source data into 3 layers that they can use directly.

  1. exploded send and serve to canvas level, message level and recommend level. every level shares the information of the levels above it.
  2. for impression, click, merch_impression and pl_click, depending on the lowest level of the action, we generated data at that level and every higher level. for example, an impression happens on a message, so it has canvas level and message level data; a pl_click happens on an item (the user clicked an item), so it has canvas level, message level and recommend level data.

But don’t worry, we only save the lowest level of the underlying data, so we strike a balance between saving storage resources and creating more usable redundant data.
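
The level expansion can be pictured with a small Python sketch. It assumes each stored row records its lowest level in a hypothetical `lowest_level` field and that higher-level rows are derived from it; how the real tables derive them is not shown here.

```python
LEVELS = ["canvas", "message", "recommend"]  # highest to lowest


def expand_levels(event):
    """Return one row for the event's lowest level and one for every level above it."""
    depth = LEVELS.index(event["lowest_level"]) + 1
    return [{**event, "level": level} for level in LEVELS[:depth]]


impression_rows = expand_levels({"action": "impression", "lowest_level": "message"})
pl_click_rows = expand_levels({"action": "pl_click", "lowest_level": "recommend"})
```

An impression on a message yields canvas and message rows, and a pl_click on an item yields all three, while only the single lowest-level row needs to be stored.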