Script 1103: Script Group Country Code
Purpose
The script extracts the country code from a group name by identifying the segment before the first hyphen and assigns it to a ‘Country Code’ dimension.
To Elaborate
The Python script parses each group name and extracts the country code, defined as the segment before the first hyphen (‘-’). The extracted code is assigned to a new dimension labeled ‘Country Code’. Group names starting with “BEFR” or “BENL” are handled as special cases: their country code is explicitly set to “BE”. For all other names, the segment is accepted only if it is exactly two characters long after trimming whitespace; otherwise the script assigns “N/A”. This is a length check only, so the segment is not validated against a list of real country codes. The script processes every group name in the input data and strips leading and trailing whitespace from the group names before outputting the results.
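To make the rule concrete, the following is a minimal sketch that restates the extraction logic in compact form and runs it on a few made-up group names. The function name `country_code` and the sample names are illustrative assumptions, not part of the original script.

```python
import re


def country_code(group_name: str) -> str:
    """Compact restatement of the rule described above (illustrative only)."""
    # Special cases: Belgian groups are prefixed with BEFR or BENL.
    if group_name.startswith(("BEFR", "BENL")):
        return "BE"
    # Capture everything before the first hyphen.
    match = re.search(r"^([^-]+)", group_name)
    segment = match.group(1).strip() if match else ""
    # Length check only: any two-character segment is accepted.
    return segment if len(segment) == 2 else "N/A"


# Hypothetical group names chosen to exercise each branch.
samples = ["FR-Brand-Exact", "BEFR-Brand", "BENL-Generic", "USA-Brand", "Brand", "12-Promo"]
results = {name: country_code(name) for name in samples}

for name, code in results.items():
    print(f"{name!r} -> {code}")
```

The last sample shows a consequence of the length-only check: “12” passes because it has two characters, even though it is not a country code. A stricter check (for example, two uppercase letters) would be a change to the script's behavior, not something the original does.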
Walking Through the Code
- Configurable Parameters:
  - The script begins by defining a configurable parameter, `PLACEMENT_KEY`, set to the hyphen (‘-’). It names the delimiter that marks where the country code ends. The regular expression inside the function hardcodes the hyphen rather than referencing this parameter, so changing `PLACEMENT_KEY` alone does not change the behavior.
  - The primary data source, `inputDf`, is retrieved from the dictionary `dataSourceDict`.
- Function Definition:
  - A function, `get_country_code_from_group_name`, is defined to extract the country code from a given group name.
  - The function first checks for special cases where the group name starts with “BEFR” or “BENL”, returning “BE” for these.
  - For other group names, a regular expression captures the segment before the first hyphen. If the trimmed segment is exactly two characters long, it is returned as the country code; otherwise, “N/A” is returned.
- Data Processing:
  - The script copies all rows from the input DataFrame `inputDf` to a new DataFrame, `outputDf`.
  - It applies `get_country_code_from_group_name` to each group name in the input data to populate the ‘Country Code’ column of the output DataFrame.
  - Leading and trailing whitespace is then removed from the group names in the output DataFrame.
- Output:
  - Finally, the script prints a tableized version of the output DataFrame, displaying the processed data with the extracted country codes.
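The processing steps above can be tried outside the platform with a small stand-in DataFrame. This is a sketch under stated assumptions: the sample rows are invented, `inputDf` is built by hand instead of coming from `dataSourceDict`, and `DataFrame.to_string` stands in for the platform's `tableize`, whose exact output format is not shown in the source.

```python
import re

import pandas as pd


def get_country_code_from_group_name(group_name):
    # Same logic as the script's function, condensed.
    if group_name.startswith(("BEFR", "BENL")):
        return "BE"
    match = re.search(r"^([^-]+)", group_name)
    if match and len(match.group(1).strip()) == 2:
        return match.group(1).strip()
    return "N/A"


# Hypothetical stand-in for the platform-provided dataSourceDict["1"].
inputDf = pd.DataFrame({
    "Account": ["Acme"] * 4,
    "Campaign": ["Search"] * 4,
    "Group": ["FR-Brand ", " DE-Generic", "BENL-Brand", " BEFR-Brand"],
})

# Copy all rows, derive the country code, then trim the group names.
outputDf = inputDf.copy()
outputDf["Country Code"] = inputDf["Group"].apply(get_country_code_from_group_name)
outputDf["Group"] = outputDf["Group"].str.strip()

# to_string is only a stand-in for the platform's tableize helper.
print(outputDf.to_string(index=False))
```

Because the country code is computed from the untrimmed names, a leading space before “BEFR” or “BENL” prevents the special case from matching, and that row falls through to “N/A”. Names such as “ DE-Generic” still resolve correctly because the captured segment is trimmed before its length is checked.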
Vitals
- Script ID: 1103
- Client ID / Customer ID: 1306927809 / 60270355
- Action Type: Bulk Upload
- Item Changed: AdGroup
- Output Columns: Account, Campaign, Group, Country Code
- Linked Datasource: M1 Report
- Reference Datasource: None
- Owner: Grégory Pantaine (gpantaine@marinsoftware.com)
- Created by Grégory Pantaine on 2024-05-15 14:16
- Last Updated by Grégory Pantaine on 2024-05-15 14:37
Python Code
## name: Script - Group - Country Code
## description:
##  Parse Group Name and pick out the country code into a dimension 'Country Code'.
##  Country code appears before the first '-' in the group name.
##
## Copied by Grégory Pantaine
## created: 2024-05-15

# Explicit import; harmless if the scripting environment already provides re
import re

########### Configurable Params - START ##########
# Delimiter that ends the country code (note: the regex below hardcodes '-')
PLACEMENT_KEY = '-'

# Primary data source and columns (dataSourceDict is provided by the platform)
inputDf = dataSourceDict["1"]

# Output columns and initial values
RPT_COL_ACCOUNT = 'Account'
RPT_COL_CAMPAIGN = 'Campaign'
RPT_COL_GROUP = 'Group'
RPT_COL_COUNTRY_CODE = 'Country Code'
BULK_COL_ACCOUNT = 'Account'
BULK_COL_CAMPAIGN = 'Campaign'
BULK_COL_GROUP = 'Group'
BULK_COL_COUNTRY_CODE = 'Country Code'


# Function to extract country code from group name
def get_country_code_from_group_name(group_name):
    # Special cases for BEFR and BENL
    if group_name.startswith("BEFR") or group_name.startswith("BENL"):
        return "BE"

    # Regular expression pattern to match the segment before the first '-'
    regex_pattern = r"^([^-]+)"

    # Search for the country code using the pattern
    match = re.search(regex_pattern, group_name)

    if match:
        country_code = match.group(1).strip()
        if len(country_code) == 2:
            return country_code  # Return the matched two-character segment
        else:
            return "N/A"  # Segment is not exactly two characters long
    else:
        return "N/A"  # Return "N/A" if no match is found


# Copy all input rows to output
outputDf = inputDf.copy()

# Extract country code from each (untrimmed) group name
outputDf[BULK_COL_COUNTRY_CODE] = inputDf[RPT_COL_GROUP].apply(get_country_code_from_group_name)

# Remove extra whitespace from group names
outputDf[RPT_COL_GROUP] = outputDf[RPT_COL_GROUP].str.strip()

# Print the tableized version of the output DataFrame (tableize is provided by the platform)
print(tableize(outputDf))
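In the script, `PLACEMENT_KEY` is defined but the pattern `^([^-]+)` hardcodes the hyphen. If the delimiter should actually be configurable, one possible variation is to build the pattern from the parameter. This is a sketch of that idea, not the original author's code; the helper name `build_prefix_pattern` is hypothetical, and it only handles single-character delimiters.

```python
import re

PLACEMENT_KEY = "-"  # same value as in the script


def build_prefix_pattern(delimiter: str) -> re.Pattern:
    """Return a pattern capturing everything before the first delimiter.

    Hypothetical helper: supports single-character delimiters only.
    """
    if len(delimiter) != 1:
        raise ValueError("this sketch only supports single-character delimiters")
    # re.escape keeps the delimiter literal inside the character class.
    return re.compile(r"^([^" + re.escape(delimiter) + r"]+)")


prefix_pattern = build_prefix_pattern(PLACEMENT_KEY)
match = prefix_pattern.search("FR-Brand-Exact")
print(match.group(1) if match else "N/A")
```

With this variation, changing `PLACEMENT_KEY` to another single character, such as an underscore, would change where the country code is cut without touching the function body.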
Post generated on 2024-11-27 06:58:46 GMT