Script 187: populate Auto Events

In a Nutshell

The script processes PPC campaign data to update event names based on specific patterns in group names.

To Elaborate

  • The script is designed to manipulate and update a dataset related to PPC (Pay-Per-Click) campaigns.
  • It focuses on extracting and transforming parts of the campaign group names to generate new event names.
  • The business rules applied seem to involve:
    • Identifying parts of the group name that match a certain pattern (e.g., a year followed by a dash).
    • Splitting the group name and extracting specific segments to construct the new event names.
    • Filtering and updating the dataset with these new event names only where changes are necessary.

Walking Through the Code

  • The code uses a pandas DataFrame (inputDf) to store and manipulate PPC campaign data.
  • Temporary fields are created in the DataFrame to hold intermediate data during processing.
  • A regular expression pattern is defined to match group names containing a year followed by a dash.
  • The script splits group names based on the pattern and assigns parts of the split name to temporary fields.
  • It then constructs new event names by combining different parts of the group name based on certain conditions (e.g., excluding ‘Brand’ or ‘Generic’).
  • The final step involves updating the original event names in the output DataFrame (outputDf) with the newly constructed event names, but only for rows where the event name has actually changed.
  • The script filters out rows where no update is needed, ensuring that only relevant changes are made.
  • Constants like BULK_COL_ACCOUNT, BULK_COL_CAMPAIGN, BULK_COL_GROUP, and BULK_COL_AUTOEVENTS are used to reference specific columns in the DataFrame, indicating that these are key fields in the PPC dataset.
  • Multiple temporary fields (TMP_FIELD, TMP_FIELD2, TMP_FIELD3, TMP_FIELD4) are used to facilitate the transformation process without affecting the original data until the final update is confirmed.
  • The script concludes by displaying the updated DataFrame, which now contains the new event names where applicable.

Vitals

  • Script ID : 187
  • Client ID / Customer ID: 1306924555 / 60269337
  • Action Type: Bulk Upload
  • Item Changed: AdGroup
  • Output Columns: Account, Campaign, Group, Events
  • Created by Jonathan Reichl on 2023-06-13 08:39:02
  • Last Updated by Jonathan Reichl on 2023-12-06 04:01:47
> See it in Action

Python Code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
#set TMP_FIELD = split group name 
BULK_COL_ACCOUNT = 'Account'
BULK_COL_CAMPAIGN = 'Campaign'
BULK_COL_GROUP = 'Group'
BULK_COL_AUTOEVENTS = 'Events'

#outputDf[BULK_COL_ACCOUNT] = "<<YOUR VALUE>>"
#outputDf[BULK_COL_CAMPAIGN] = "<<YOUR VALUE>>"
#outputDf[BULK_COL_GROUP] = "<<YOUR VALUE>>"
#outputDf[BULK_COL_AUTOEVENTS] = "<<YOUR VALUE>>"


today = datetime.datetime.now(CLIENT_TIMEZONE).date()
print(tableize(inputDf))

TMP_FIELD = BULK_COL_AUTOEVENTS + '_new'
# blank out tmp field
inputDf[TMP_FIELD] = numpy.nan
TMP_FIELD2 = BULK_COL_AUTOEVENTS + '_new2'
# blank out tmp field
inputDf[TMP_FIELD] = numpy.nan
TMP_FIELD3 = BULK_COL_AUTOEVENTS + '_new3'
# blank out tmp field
inputDf[TMP_FIELD] = numpy.nan
TMP_FIELD4 = BULK_COL_AUTOEVENTS + '_new4'
# blank out tmp field
inputDf[TMP_FIELD] = numpy.nan


pattern = r"\d{4}-"
regex = re.compile(pattern, re.IGNORECASE)



#set TMP_FIELD = split group name 
inputDf.loc[ (inputDf[BULK_COL_GROUP].str.contains(regex)) , TMP_FIELD ] = inputDf[BULK_COL_GROUP].str.split('-').str[1]
inputDf.loc[ (inputDf[BULK_COL_GROUP].str.contains(regex)) , TMP_FIELD2 ] = inputDf[BULK_COL_GROUP].str.split('-').str[2]
#inputDf.loc[ : , TMP_FIELD3 ] = inputDf[TMP_FIELD2].str.split('_').str[0]
inputDf.loc[ (inputDf[TMP_FIELD2] != 'Brand')| (inputDf[TMP_FIELD2] != 'Generic') , TMP_FIELD3 ] = inputDf[TMP_FIELD2].str.split(r'[^a-zA-Z]+').str[0]


inputDf.loc[ (inputDf[TMP_FIELD3] !=  inputDf[TMP_FIELD2]) , TMP_FIELD4 ] = inputDf[TMP_FIELD]+'-'+inputDf[TMP_FIELD3]
inputDf.loc[inputDf[TMP_FIELD3].str.len() <= 1, TMP_FIELD4] = inputDf[TMP_FIELD].str.split('_').str[0]

print(tableize(inputDf))

#set output 
outputDf.loc[:,BULK_COL_AUTOEVENTS] = inputDf.loc[:, TMP_FIELD4]

#Filter only updates
outputDf = outputDf[ inputDf[TMP_FIELD4].notnull() & (inputDf[BULK_COL_AUTOEVENTS] != inputDf[TMP_FIELD4]) ]

Post generated on 2024-02-22 06:05:49 GMT

comments powered by Disqus