Script 1603: Script Group Country Code 02
Purpose:
The script extracts the country code from a group name by identifying the segment before the first hyphen.
To Elaborate
The Python script parses group names and extracts the country code, which appears before the first hyphen in the group name. This is useful for organizing and categorizing data by country, for example in reporting and analysis for multinational operations. The script takes a dataset, applies a function that extracts the country code from each group name, and outputs the dataset with a new column for the country code. A code is accepted only if the segment before the first hyphen is exactly two characters long after trimming whitespace; otherwise the script defaults to “N/A”. A special case that maps group names starting with BEFR or BENL to “BE” is present in the code but commented out, so it is not applied.
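To make the rule concrete, here is a minimal sketch of the extraction logic on its own, run against a few made-up group names. The sample names are illustrative assumptions, not taken from any real account; the function body mirrors the regular expression and the two-character length check used in the script.

```python
import re

def get_country_code_from_group_name(group_name):
    # Capture everything before the first '-' (at least one character).
    match = re.search(r"^([^-]+)", group_name)
    if match:
        country_code = match.group(1).strip()
        # Only the length is checked; the value is not compared
        # against a list of real country codes.
        if len(country_code) == 2:
            return country_code
    return "N/A"

# Hypothetical group names, for illustration only
samples = ["FR - Brand - Exact", "DE-Generic", "BEFR - Brand", "-Leading hyphen", "UK"]
for name in samples:
    print(repr(name), "->", get_country_code_from_group_name(name))
```

Two behaviors are worth noting: a name that starts with a hyphen yields “N/A” because the pattern needs at least one character before it, and a two-character name with no hyphen at all is returned as-is.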
Walking Through the Code
- Configurable Parameters:
  - The script begins by defining a configurable parameter, PLACEMENT_KEY, which is set to a hyphen ('-'). It names the delimiter that marks the end of the country code in the group name. Note that the extraction function hardcodes the hyphen in its regular expression rather than reading this parameter.
  - The primary data source is specified, and the relevant columns are defined for both input and output data.
- Function Definition:
  - A function, get_country_code_from_group_name, is defined to extract the country code from a given group name.
  - The function uses a regular expression to capture the segment before the first hyphen, trims surrounding whitespace, and checks whether the result is exactly two characters long. It does not validate the value against a list of country codes.
  - If the check passes, the code is returned; otherwise, “N/A” is returned.
- Data Processing:
  - The script copies all input data to an output DataFrame to preserve the original data structure.
  - It applies get_country_code_from_group_name to each group name in the input data, storing the results in a new column for country codes.
- Data Cleaning and Output:
  - The script strips leading and trailing whitespace from the group names in the output DataFrame.
  - Finally, it prints the output DataFrame in a table format for easy viewing and analysis.
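The steps above can be sketched end to end outside the scripting environment. This is a rough illustration that assumes pandas is available; the small DataFrame stands in for dataSourceDict["1"], its rows are invented, and pandas' own to_string replaces the platform's tableize helper.

```python
import re
import pandas as pd

def get_country_code_from_group_name(group_name):
    # Same rule as the script: segment before the first '-', exactly two characters.
    match = re.search(r"^([^-]+)", group_name)
    if match and len(match.group(1).strip()) == 2:
        return match.group(1).strip()
    return "N/A"

# Stand-in for dataSourceDict["1"]; rows are made up for illustration
inputDf = pd.DataFrame({
    "Account": ["Acme", "Acme", "Acme"],
    "Campaign": ["Brand", "Generic", "Brand"],
    "Group": [" FR - Brand ", "DE-Generic", "Benelux - Brand"],
})

# Copy the input so the original rows are left untouched
outputDf = inputDf.copy()

# Derive the new column from the (still unstripped) input group names
outputDf["Country Code"] = inputDf["Group"].apply(get_country_code_from_group_name)

# Trim leading and trailing whitespace from the group names in the output only
outputDf["Group"] = outputDf["Group"].str.strip()

print(outputDf.to_string(index=False))
```

Because the function trims the captured segment itself, the country code comes out correctly even though the group names are only stripped afterwards.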
Vitals
- Script ID: 1603
- Client ID / Customer ID: 1306927809 / 60270355
- Action Type: Bulk Upload
- Item Changed: AdGroup
- Output Columns: Account, Campaign, Group, Country Code
- Linked Datasource: M1 Report
- Reference Datasource: None
- Owner: Grégory Pantaine (gpantaine@marinsoftware.com)
- Created by Grégory Pantaine on 2025-01-02 15:23
- Last Updated by Grégory Pantaine on 2025-01-06 16:00
Python Code
## name: Script - Group - Country Code
## description:
## Parse Group Name and pick out the country code into a dimension 'Country Code'.
## Country code appears before the first '-' in the group name.
##
## Copied by Grégory Pantaine
## created: 2024-05-15

import re  # required by re.search below; harmless if the environment already provides it

########### Configurable Params - START ##########
# Delimiter that follows the country code (note: the regex below hardcodes '-')
PLACEMENT_KEY = '-'

# Primary data source and columns
# (dataSourceDict and tableize are not defined here; they appear to be supplied by the script environment)
inputDf = dataSourceDict["1"]

# Output columns and initial values
RPT_COL_ACCOUNT = 'Account'
RPT_COL_CAMPAIGN = 'Campaign'
RPT_COL_GROUP = 'Group'
RPT_COL_COUNTRY_CODE = 'Country Code'
BULK_COL_ACCOUNT = 'Account'
BULK_COL_CAMPAIGN = 'Campaign'
BULK_COL_GROUP = 'Group'
BULK_COL_COUNTRY_CODE = 'Country Code'

# Function to extract country code from group name
def get_country_code_from_group_name(group_name):
    # Special cases for BEFR and BENL
    # if group_name.startswith("BEFR") or group_name.startswith("BENL"):
    #     return "BE"

    # Regular expression pattern to match the country code before the first '-'
    regex_pattern = r"^([^-]+)"

    # Search for the country code using the pattern
    match = re.search(regex_pattern, group_name)
    if match:
        country_code = match.group(1).strip()
        if len(country_code) == 2:
            return country_code  # Return the matched two-character code
        else:
            return "N/A"  # Segment before the first '-' is not two characters long
    else:
        return "N/A"  # Return "N/A" if no match is found

# Copy all input rows to output
outputDf = inputDf.copy()

# Extract country code from each group name
outputDf[BULK_COL_COUNTRY_CODE] = inputDf[RPT_COL_GROUP].apply(get_country_code_from_group_name)

# Remove leading and trailing whitespace from group names
outputDf[RPT_COL_GROUP] = outputDf[RPT_COL_GROUP].str.strip()

# Print the tableized version of the output DataFrame
print(tableize(outputDf))
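The commented-out lines in the function suggest that group names starting with BEFR or BENL were at some point meant to map to “BE”. The sketch below shows one way that branch could be re-enabled. The function name get_country_code_with_special_cases is hypothetical, and the guard for non-string values and the initial strip are assumptions added here (for example, to cope with missing group names), not behavior of the script above.

```python
import re

def get_country_code_with_special_cases(group_name):
    # Assumption: missing or non-text values should yield "N/A" instead of raising.
    if not isinstance(group_name, str):
        return "N/A"
    name = group_name.strip()
    # The branch that is commented out in the script, re-enabled here
    if name.startswith(("BEFR", "BENL")):
        return "BE"
    # Same rule as the script: segment before the first '-', exactly two characters
    match = re.search(r"^([^-]+)", name)
    if match:
        code = match.group(1).strip()
        if len(code) == 2:
            return code
    return "N/A"

# Hypothetical inputs, for illustration only
for value in ["BEFR - Brand", "BENL-Generic", "FR - Brand", None]:
    print(repr(value), "->", get_country_code_with_special_cases(value))
```

Without a guard of this kind, re.search raises a TypeError when it is handed a non-string value, which would stop the apply step partway through the data.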
Post generated on 2025-03-11 01:25:51 GMT