Script 1205: Script Group Country Code
Purpose:
The script extracts and assigns a country code from a group name into a designated column based on specific parsing rules.
To Elaborate
The Python script parses group names and writes the extracted country code to a column labeled ‘Country Code’. The country code is the substring before the first hyphen (‘-’) in the group name, with surrounding whitespace trimmed. Group names starting with “BEFR” or “BENL” are a special case: the country code is set to “BE”. If the trimmed substring is not exactly two characters long, the script assigns “N/A” to indicate that no valid country code could be determined. The check is on length only, so any two-character prefix is accepted as a code. This is useful for organizing and categorizing data by the geographic identifiers embedded in group names.
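The rules above can be illustrated with a short, standalone sketch. The function name `country_code` and the sample group names are invented for this illustration and do not come from the script; only the rules themselves do.

```python
import re

def country_code(group_name: str) -> str:
    """Illustrative restatement of the parsing rules (not the script's own function)."""
    # Belgian groups carry a language suffix (BEFR/BENL) and map to "BE".
    if group_name.startswith(("BEFR", "BENL")):
        return "BE"
    # Take everything before the first hyphen and trim surrounding spaces.
    match = re.match(r"([^-]+)", group_name)
    prefix = match.group(1).strip() if match else ""
    # Only a two-character prefix counts as a country code.
    return prefix if len(prefix) == 2 else "N/A"

# Hypothetical group names, chosen only to exercise each rule.
samples = [
    "FR - Brand - Exact",   # two-character prefix
    "BEFR - Brand",         # Belgian special case
    "BENL-Generic",         # Belgian special case, no spaces around the hyphen
    "USA - Brand",          # prefix too long
    "NoHyphenHere",         # no hyphen: the whole name is the prefix
    "- Leading hyphen",     # nothing before the first hyphen
]
results = {name: country_code(name) for name in samples}
for name, code in results.items():
    print(f"{name!r:25} -> {code}")
```

Only the first three samples yield a code ("FR", "BE", "BE"); the rest fall through to "N/A".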
Walking Through the Code
- Configurable Parameters:
  - The script begins by defining a configurable parameter, PLACEMENT_KEY, set to ‘ - ’. The constant is declared but never referenced afterwards; the parsing relies on a regular expression that matches everything up to the first hyphen.
  - The primary data source is inputDf, which is the entry dataSourceDict["1"].
- Function Definition:
  - A function, get_country_code_from_group_name, is defined to extract the country code from a given group name.
  - The function handles the special cases first: group names starting with “BEFR” or “BENL” return “BE”.
  - Otherwise it uses a regular expression to capture the substring before the first hyphen, trims surrounding whitespace, and returns the result only if it is exactly two characters long. In every other case it returns “N/A”.
- Data Processing:
  - The script copies all rows from the input DataFrame inputDf to outputDf.
  - It applies get_country_code_from_group_name to each group name in the input DataFrame, storing the result in the ‘Country Code’ column of outputDf.
- Data Cleaning:
  - Leading and trailing whitespace is removed from the group names in outputDf. This happens after the country code has been extracted.
- Output:
  - Finally, the script prints a tableized version of the output DataFrame, displaying the processed data with the extracted country codes.
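The steps above can be sketched end to end on a small DataFrame. This is a minimal sketch under stated assumptions: the rows are invented, `dataSourceDict` is replaced by a hand-built stand-in, and `DataFrame.to_string` is used in place of the platform's `tableize` helper, which is not available outside the scripting environment.

```python
import re
import pandas as pd

def get_country_code_from_group_name(group_name):
    # Same rules as the script: BEFR/BENL -> "BE", else a two-character prefix or "N/A".
    if group_name.startswith(("BEFR", "BENL")):
        return "BE"
    match = re.search(r"^([^-]+)", group_name)
    code = match.group(1).strip() if match else ""
    return code if len(code) == 2 else "N/A"

# Stand-in for the platform-provided dataSourceDict; all rows are invented.
dataSourceDict = {
    "1": pd.DataFrame({
        "Account": ["Acme"] * 5,
        "Campaign": ["Brand", "Brand", "Generic", "Generic", "Generic"],
        "Group": ["FR - Shoes", " DE - Shoes ", "BENL - Shoes",
                  "EMEA - Shoes", " BEFR - Shoes"],
    })
}
inputDf = dataSourceDict["1"]

# Copy all input rows, derive the new column, then trim the group names.
outputDf = inputDf.copy()
outputDf["Country Code"] = inputDf["Group"].apply(get_country_code_from_group_name)
outputDf["Group"] = outputDf["Group"].str.strip()

# Stand-in for print(tableize(outputDf)).
print(outputDf.to_string(index=False))
```

The last row shows an ordering effect: because the group name is trimmed only after extraction, " BEFR - Shoes" does not match the `startswith` check and, with a four-character prefix, ends up as "N/A". The " DE - Shoes " row is unaffected because the captured prefix is stripped before its length is checked.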
Vitals
- Script ID : 1205
- Client ID / Customer ID: 1306927811 / 60270355
- Action Type: Bulk Upload
- Item Changed: AdGroup
- Output Columns: Account, Campaign, Group, Country Code
- Linked Datasource: M1 Report
- Reference Datasource: None
- Owner: Grégory Pantaine (gpantaine@marinsoftware.com)
- Created by Grégory Pantaine on 2024-06-21 11:54
- Last Updated by Grégory Pantaine on 2024-06-21 12:01
Python Code
## name: Script - Group - Country Code
## description:
##  Parse Group Name and pick out the country code into a dimension 'Country Code'.
##  Country code appears before the first '-' in the group name.
##
## Copied by Grégory Pantaine
## created: 2024-06-21

import re  # required by re.search below

########### Configurable Params - START ##########
PLACEMENT_KEY = ' - '  # declared but not referenced elsewhere in this script

# Primary data source and columns
inputDf = dataSourceDict["1"]

# Output columns and initial values
RPT_COL_ACCOUNT = 'Account'
RPT_COL_CAMPAIGN = 'Campaign'
RPT_COL_GROUP = 'Group'
RPT_COL_COUNTRY_CODE = 'Country Code'
BULK_COL_ACCOUNT = 'Account'
BULK_COL_CAMPAIGN = 'Campaign'
BULK_COL_GROUP = 'Group'
BULK_COL_COUNTRY_CODE = 'Country Code'

# Function to extract country code from group name
def get_country_code_from_group_name(group_name):
    # Special cases for BEFR and BENL
    if group_name.startswith("BEFR") or group_name.startswith("BENL"):
        return "BE"

    # Regular expression pattern to match the text before the first '-'
    regex_pattern = r"^([^-]+)"

    # Search for the country code using the pattern
    match = re.search(regex_pattern, group_name)
    if match:
        country_code = match.group(1).strip()
        if len(country_code) == 2:
            return country_code  # Return the matched country code
        else:
            return "N/A"  # Prefix is not exactly two characters long
    else:
        return "N/A"  # Return "N/A" if no match is found

# Copy all input rows to output
outputDf = inputDf.copy()

# Extract country code from each group name
outputDf[BULK_COL_COUNTRY_CODE] = inputDf[RPT_COL_GROUP].apply(get_country_code_from_group_name)

# Remove leading and trailing whitespace from group names
outputDf[RPT_COL_GROUP] = outputDf[RPT_COL_GROUP].str.strip()

# Print the tableized version of the output DataFrame
print(tableize(outputDf))
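The function calls `group_name.startswith` directly, so a non-string value in the Group column (for example a missing value, which pandas represents as a float NaN) would raise an `AttributeError`. Whether the report can contain such values is not stated in the post, so the following is only a possible guard, not part of the script. The names `guard_non_strings` and `first_segment` are hypothetical, and `first_segment` is a simplified stand-in extractor used to keep the sketch self-contained.

```python
def guard_non_strings(extract, fallback="N/A"):
    """Wrap an extractor so non-string inputs return `fallback` instead of raising."""
    def guarded(value):
        if not isinstance(value, str):
            return fallback
        # Trim first so prefix checks such as startswith("BEFR") see clean text.
        return extract(value.strip())
    return guarded

# Stand-in extractor (hypothetical): returns the text before the first hyphen.
def first_segment(name):
    return name.split("-", 1)[0].strip()

safe_first_segment = guard_non_strings(first_segment)

print(safe_first_segment("  FR - Brand "))
print(safe_first_segment(float("nan")))
print(safe_first_segment(None))
```

In the script itself, the same wrapper could be passed to `apply` as `guard_non_strings(get_country_code_from_group_name)`. Trimming before extraction would also let group names with leading spaces reach the “BEFR”/“BENL” check.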
Post generated on 2025-03-11 01:25:51 GMT