Script 1205: Script Group Country Code
Purpose
The script extracts and assigns a country code from a group name into a designated ‘Country Code’ dimension.
To Elaborate
The Python script parses group names and assigns the extracted country code to a ‘Country Code’ dimension. The country code is the substring that appears before the first hyphen (‘-’) in the group name, with surrounding whitespace trimmed. Group names starting with “BEFR” or “BENL” are special cases and map directly to “BE”. If the extracted substring is not exactly two characters long, the script assigns “N/A”; this includes group names that contain no hyphen, unless the whole name is two characters long. The resulting dimension lets ad groups be grouped and filtered by country in reporting and analysis.
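The extraction rule described above can be sketched as a standalone function. This is a minimal illustration, not the script itself: the function name country_code_from_group and the sample group names are made up for the example.

```python
import re


def country_code_from_group(group_name: str) -> str:
    """Return the two-character prefix before the first hyphen, or "N/A"."""
    # Special cases: names starting with BEFR or BENL map to "BE".
    if group_name.startswith(("BEFR", "BENL")):
        return "BE"
    # Capture everything before the first hyphen, then trim whitespace.
    match = re.match(r"([^-]+)", group_name)
    if match:
        prefix = match.group(1).strip()
        if len(prefix) == 2:
            return prefix
    return "N/A"


# Hypothetical group names, chosen to exercise each branch.
print(country_code_from_group("FR - Brand"))      # FR
print(country_code_from_group("DE-Shoes"))        # DE
print(country_code_from_group("BEFR - Generic"))  # BE
print(country_code_from_group("USA - Brand"))     # N/A (prefix is three characters)
print(country_code_from_group("NoHyphen"))        # N/A (whole name is the prefix)
```

Note that the length check is the only validation: any two-character prefix is accepted, whether or not it is a real country code.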
Walking Through the Code
- Configurable Parameters:
  - The script begins by defining a configurable parameter PLACEMENT_KEY, set to ‘ - ’, which names the delimiter used in group names. The code in this post never references it; the extraction function matches on a bare hyphen instead.
  - The primary data source is inputDf, which is retrieved from the dictionary dataSourceDict.
- Function Definition:
  - A function get_country_code_from_group_name is defined to extract the country code from a given group name.
  - The function first checks for the special cases where the group name starts with “BEFR” or “BENL” and returns “BE” for them.
  - For other group names, a regular expression captures the substring before the first hyphen, and surrounding whitespace is trimmed. If the result is exactly two characters long, it is returned as the country code; otherwise, “N/A” is returned.
- Data Processing:
  - The script copies all rows from the input DataFrame inputDf to outputDf.
  - It applies get_country_code_from_group_name to each group name in the input DataFrame to populate the ‘Country Code’ column in the output DataFrame.
  - Leading and trailing whitespace is then removed from the group names in the output DataFrame.
- Output:
  - The script concludes by printing a tableized version of the output DataFrame, which includes the newly assigned country codes.
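The data-processing steps above can be tried outside the platform with a small pandas sketch. The sample rows are invented, the DataFrame stands in for dataSourceDict["1"], and to_string replaces tableize, which is not defined in the script and is assumed to come from the script runtime.

```python
import re

import pandas as pd


def get_country_code_from_group_name(group_name):
    # Same rule as the script: BEFR/BENL map to "BE", otherwise take the
    # trimmed text before the first hyphen if it is two characters long.
    if group_name.startswith("BEFR") or group_name.startswith("BENL"):
        return "BE"
    match = re.search(r"^([^-]+)", group_name)
    if match and len(match.group(1).strip()) == 2:
        return match.group(1).strip()
    return "N/A"


# Hypothetical stand-in for dataSourceDict["1"].
inputDf = pd.DataFrame({
    "Account": ["Acme", "Acme", "Acme", "Acme"],
    "Campaign": ["Search", "Search", "Search", "Search"],
    "Group": ["FR - Brand ", "BENL - Generic", "USA - Brand", " BEFR - Shoes"],
})

# Copy all input rows, derive the country code, then trim the group names.
outputDf = inputDf.copy()
outputDf["Country Code"] = inputDf["Group"].apply(get_country_code_from_group_name)
outputDf["Group"] = outputDf["Group"].str.strip()

print(outputDf.to_string(index=False))
```

Because the country code is computed from the untrimmed input column, the last row (leading space before BEFR) misses the special case and falls through to “N/A”, even though its group name is trimmed afterwards. Trimming before applying the function would change that result.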
Vitals
- Script ID : 1205
- Client ID / Customer ID: 1306927811 / 60270355
- Action Type: Bulk Upload
- Item Changed: AdGroup
- Output Columns: Account, Campaign, Group, Country Code
- Linked Datasource: M1 Report
- Reference Datasource: None
- Owner: Grégory Pantaine (gpantaine@marinsoftware.com)
- Created by Grégory Pantaine on 2024-06-21 11:54
- Last Updated by Grégory Pantaine on 2024-06-21 12:01
Python Code
## name: Script - Group - Country Code
## description:
##  Parse Group Name and pick out the country code into a dimension 'Country Code'.
##  Country code appears before the first '-' in the group name.
##
## Copied by Grégory Pantaine
## created: 2024-06-21

import re  # needed for re.search below

########### Configurable Params - START ##########

# Delimiter used in group names (defined here but not referenced below)
PLACEMENT_KEY = ' - '

# Primary data source and columns
# (dataSourceDict is not defined in this script; it is expected from the script runtime)
inputDf = dataSourceDict["1"]

# Output columns and initial values
RPT_COL_ACCOUNT = 'Account'
RPT_COL_CAMPAIGN = 'Campaign'
RPT_COL_GROUP = 'Group'
RPT_COL_COUNTRY_CODE = 'Country Code'
BULK_COL_ACCOUNT = 'Account'
BULK_COL_CAMPAIGN = 'Campaign'
BULK_COL_GROUP = 'Group'
BULK_COL_COUNTRY_CODE = 'Country Code'


# Function to extract country code from group name
def get_country_code_from_group_name(group_name):
    # Special cases for BEFR and BENL
    if group_name.startswith("BEFR") or group_name.startswith("BENL"):
        return "BE"

    # Regular expression pattern to match the text before the first '-'
    regex_pattern = r"^([^-]+)"

    # Search for the country code using the pattern
    match = re.search(regex_pattern, group_name)

    if match:
        country_code = match.group(1).strip()
        if len(country_code) == 2:
            return country_code  # Return the matched country code
        else:
            return "N/A"  # Prefix is not exactly two characters long
    else:
        return "N/A"  # Return "N/A" if no match is found


# Copy all input rows to output
outputDf = inputDf.copy()

# Extract country code from each (untrimmed) input group name
outputDf[BULK_COL_COUNTRY_CODE] = inputDf[RPT_COL_GROUP].apply(get_country_code_from_group_name)

# Remove leading and trailing whitespace from group names
outputDf[RPT_COL_GROUP] = outputDf[RPT_COL_GROUP].str.strip()

# Print the tableized version of the output DataFrame
# (tableize is not defined in this script; it is expected from the script runtime)
print(tableize(outputDf))
Post generated on 2024-11-27 06:58:46 GMT