Script 187: populate Auto Events
In a Nutshell
The script processes PPC campaign data to update event names based on specific patterns in group names.
To Elaborate
- The script is designed to manipulate and update a dataset related to PPC (Pay-Per-Click) campaigns.
- It focuses on extracting and transforming parts of the campaign group names to generate new event names.
- The business rules applied seem to involve:
- Identifying parts of the group name that match a certain pattern (e.g., a year followed by a dash).
- Splitting the group name and extracting specific segments to construct the new event names.
- Filtering and updating the dataset with these new event names only where changes are necessary.
Walking Through the Code
- The code uses a pandas DataFrame (
inputDf
) to store and manipulate PPC campaign data. - Temporary fields are created in the DataFrame to hold intermediate data during processing.
- A regular expression pattern is defined to match group names containing a year followed by a dash.
- The script splits group names based on the pattern and assigns parts of the split name to temporary fields.
- It then constructs new event names by combining different parts of the group name based on certain conditions (e.g., excluding ‘Brand’ or ‘Generic’).
- The final step involves updating the original event names in the output DataFrame (
outputDf
) with the newly constructed event names, but only for rows where the event name has actually changed. - The script filters out rows where no update is needed, ensuring that only relevant changes are made.
- Constants like
BULK_COL_ACCOUNT
,BULK_COL_CAMPAIGN
,BULK_COL_GROUP
, andBULK_COL_AUTOEVENTS
are used to reference specific columns in the DataFrame, indicating that these are key fields in the PPC dataset. - Multiple temporary fields (
TMP_FIELD
,TMP_FIELD2
,TMP_FIELD3
,TMP_FIELD4
) are used to facilitate the transformation process without affecting the original data until the final update is confirmed. - The script concludes by displaying the updated DataFrame, which now contains the new event names where applicable.
Vitals
- Script ID : 187
- Client ID / Customer ID: 1306924555 / 60269337
- Action Type: Bulk Upload
- Item Changed: AdGroup
- Output Columns: Account, Campaign, Group, Events
- Created by Jonathan Reichl on 2023-06-13 08:39:02
- Last Updated by Jonathan Reichl on 2023-12-06 04:01:47
> See it in Action
Python Code
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
#set TMP_FIELD = split group name
BULK_COL_ACCOUNT = 'Account'
BULK_COL_CAMPAIGN = 'Campaign'
BULK_COL_GROUP = 'Group'
BULK_COL_AUTOEVENTS = 'Events'
#outputDf[BULK_COL_ACCOUNT] = "<<YOUR VALUE>>"
#outputDf[BULK_COL_CAMPAIGN] = "<<YOUR VALUE>>"
#outputDf[BULK_COL_GROUP] = "<<YOUR VALUE>>"
#outputDf[BULK_COL_AUTOEVENTS] = "<<YOUR VALUE>>"
today = datetime.datetime.now(CLIENT_TIMEZONE).date()
print(tableize(inputDf))
TMP_FIELD = BULK_COL_AUTOEVENTS + '_new'
# blank out tmp field
inputDf[TMP_FIELD] = numpy.nan
TMP_FIELD2 = BULK_COL_AUTOEVENTS + '_new2'
# blank out tmp field
inputDf[TMP_FIELD] = numpy.nan
TMP_FIELD3 = BULK_COL_AUTOEVENTS + '_new3'
# blank out tmp field
inputDf[TMP_FIELD] = numpy.nan
TMP_FIELD4 = BULK_COL_AUTOEVENTS + '_new4'
# blank out tmp field
inputDf[TMP_FIELD] = numpy.nan
pattern = r"\d{4}-"
regex = re.compile(pattern, re.IGNORECASE)
#set TMP_FIELD = split group name
inputDf.loc[ (inputDf[BULK_COL_GROUP].str.contains(regex)) , TMP_FIELD ] = inputDf[BULK_COL_GROUP].str.split('-').str[1]
inputDf.loc[ (inputDf[BULK_COL_GROUP].str.contains(regex)) , TMP_FIELD2 ] = inputDf[BULK_COL_GROUP].str.split('-').str[2]
#inputDf.loc[ : , TMP_FIELD3 ] = inputDf[TMP_FIELD2].str.split('_').str[0]
inputDf.loc[ (inputDf[TMP_FIELD2] != 'Brand')| (inputDf[TMP_FIELD2] != 'Generic') , TMP_FIELD3 ] = inputDf[TMP_FIELD2].str.split(r'[^a-zA-Z]+').str[0]
inputDf.loc[ (inputDf[TMP_FIELD3] != inputDf[TMP_FIELD2]) , TMP_FIELD4 ] = inputDf[TMP_FIELD]+'-'+inputDf[TMP_FIELD3]
inputDf.loc[inputDf[TMP_FIELD3].str.len() <= 1, TMP_FIELD4] = inputDf[TMP_FIELD].str.split('_').str[0]
print(tableize(inputDf))
#set output
outputDf.loc[:,BULK_COL_AUTOEVENTS] = inputDf.loc[:, TMP_FIELD4]
#Filter only updates
outputDf = outputDf[ inputDf[TMP_FIELD4].notnull() & (inputDf[BULK_COL_AUTOEVENTS] != inputDf[TMP_FIELD4]) ]
Post generated on 2024-02-22 06:05:49 GMT