The git filter option isn’t well documented, but it’s very useful for removing sensitive information you don’t want appearing in your public repo. This post provides an example of replacing the Azure TenantID and AppID with dummy values during the git commit process for a PowerShell script.

Background

Best practice is to never hard code sensitive information in your scripts. Better strategies include:

  • Accessing an encrypted vault at runtime (Azure Key Vault, CyberArk Privileged Access, etc.)
  • Operator-supplied information at runtime (for interactive scripts)

Having said that, there is information that isn’t “secret” but is specific to an environment that needs to be removed before making code public.

Scenario

I’ll use the PSMDE PowerShell module as an example.
This PowerShell module is used interactively to manage Defender for Endpoint from the command line. The code uses the delegated authentication of an Azure Enterprise Application to access the Defender Security API. It uses Authorization Code Flow with the following environment-specific information:

Value | Format | Comment
TenantID | GUID string | Specific to an Azure environment
ClientID | GUID string | A.K.A ApplicationID. Specific to an Azure Enterprise App.
Credentials | Multiple options | Username / Password supplied interactively in this case

The Connect-SecurityCenter.ps1 script makes authentication simple by including default values for the TenantID and ClientID. The administrator just calls the script and is prompted for credentials with sufficient access.

Connect-SecurityCenter.ps1 extract with default values for TenantID and ClientID (AppID):

Function Connect-SecurityCenter{
 [CmdletBinding()]
 Param(
    [parameter()]
    [ValidateNotNullOrEmpty()]
    [String]$TenantID = 'f8856e0c-f363-4583-9a2f-ef0208473d3c'
    ,
    [parameter()]
    [ValidateNotNullOrEmpty()]
    [String]$ClientID = '7c0cc95f-d722-4533-94e2-92530f29c07c'
 )
 ...

Someone downloading the public code is expected to edit the default values for their own environment.

So how do we develop this module with our own variable defaults, but avoid committing them to the public repo?

Git Filter

Git can apply filter scripts to the code during the staging step to replace text. The working directory retains the default values, but the repo code has sanitised replacement text.

A Clean filter script is applied when the code is staged to the git index. A Smudge filter script is applied at checkout - effectively the opposite direction. The steps below explain how to implement these filters.

Step 1: Enable filtering in the git global config

Update the git config to enable the filters as follows. This can be done in the per-repo git config, but I find it’s better as a global option…

Command format:
git config --global filter.<filter name>.<clean|smudge> <path to a script | inline command>

# Commands to run
git config --global filter.gitClean.clean "~/gitclean.sh"
git config --global filter.gitClean.smudge "~/gitsmudge.sh"
  • gitClean is the freeform name for the filter. You can create multiple filters with separate scripts.
  • gitClean has a separate clean script (gitclean.sh) and smudge script (gitsmudge.sh)
  • Both scripts are in the user’s home folder (~/ is c:\users\username on Windows)

Step 2: Create the clean and smudge scripts

Even on Windows, git uses bash to execute scripts. You might be able to get PowerShell working, but I found it easier just to stick with bash. The built-in sed command can be used to replace text as follows:

Command format:
sed -e "s/<pattern>/<replacement>/<flags>"

The text specified by <pattern> is replaced by the text in <replacement>. The “g” flag replaces every occurrence on each line; without it, only the first match on each line is replaced.
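To see what the g flag changes, here is a small sketch that pipes one line containing the same text twice through sed, with and without the flag. The sample line and the REAL/DUMMY strings are invented for illustration.

```shell
#!/usr/bin/env bash
# Sample line (made up for this demo) containing the pattern twice.
line='id=REAL, backup=REAL'

# Without the g flag, only the first match on each line is replaced.
first_only=$(printf '%s\n' "$line" | sed -e 's/REAL/DUMMY/')

# With the g flag, every match on each line is replaced.
all=$(printf '%s\n' "$line" | sed -e 's/REAL/DUMMY/g')

printf '%s\n%s\n' "$first_only" "$all"
# prints:
#   id=DUMMY, backup=REAL
#   id=DUMMY, backup=DUMMY
```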

The gitclean.sh script replaces the real tenant ID and application ID with dummy text and the gitsmudge.sh script reverses the replacement.

#!/usr/bin/env bash
# gitclean.sh
TenantID="f8856e0c-f363-4583-9a2f-ef0208473d3c"
AppID="7c0cc95f-d722-4533-94e2-92530f29c07c"
sed -e "s/$TenantID/00000000-0000-0000-0000-TENANTID/g" -e "s/$AppID/00000000-0000-0000-0000-APPID/g"

#!/usr/bin/env bash
# gitsmudge.sh
TenantID="f8856e0c-f363-4583-9a2f-ef0208473d3c"
AppID="7c0cc95f-d722-4533-94e2-92530f29c07c"
sed -e "s/00000000-0000-0000-0000-TENANTID/$TenantID/g" -e "s/00000000-0000-0000-0000-APPID/$AppID/g"

On Windows the files should be saved in the c:\users\<username> folder.
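Because smudge has to undo exactly what clean does, a quick round-trip check is worthwhile before wiring the scripts into git. The sketch below reproduces the two sed commands as shell functions rather than calling the files in the home folder, and uses made-up GUIDs, so treat it as an illustration rather than the scripts themselves.

```shell
#!/usr/bin/env bash
# Hypothetical IDs for the demo; substitute your own values.
TenantID="11111111-2222-3333-4444-555555555555"
AppID="aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee"

# Same substitutions as gitclean.sh: real values -> dummy text.
clean() {
  sed -e "s/$TenantID/00000000-0000-0000-0000-TENANTID/g" \
      -e "s/$AppID/00000000-0000-0000-0000-APPID/g"
}

# Same substitutions as gitsmudge.sh: dummy text -> real values.
smudge() {
  sed -e "s/00000000-0000-0000-0000-TENANTID/$TenantID/g" \
      -e "s/00000000-0000-0000-0000-APPID/$AppID/g"
}

# One line shaped like the parameter default in the PowerShell script.
original="[String]\$TenantID = '$TenantID'"
cleaned=$(printf '%s\n' "$original" | clean)
restored=$(printf '%s\n' "$cleaned" | smudge)

printf 'cleaned: %s\n' "$cleaned"
if [ "$restored" = "$original" ]; then
  echo "round trip OK"
else
  echo "round trip FAILED"
fi
```

If the round trip fails, the two scripts are out of step, and a later checkout would leave dummy values in the working directory.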

Watch out for Unix line-ending and encoding requirements. In Visual Studio Code:

  • If the status bar shows CRLF, click on it and change to LF
  • If the status bar shows UTF-8 with BOM, click on it and save with the UTF-8 encoding
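If you would rather check from the command line, the following sketch flags carriage returns and a UTF-8 byte order mark. The check_script function name and the temporary sample files are invented for the demo; point the function at your real scripts instead.

```shell
#!/usr/bin/env bash
# check_script is a hypothetical helper that reports encoding problems.
check_script() {
  file=$1
  problems=""
  # A UTF-8 BOM is the three bytes EF BB BF at the start of the file.
  bom=$(head -c 3 "$file" | od -An -tx1 | tr -d ' \n')
  if [ "$bom" = "efbbbf" ]; then problems="$problems BOM"; fi
  # Any carriage return means the file has CRLF line endings.
  if grep -q "$(printf '\r')" "$file"; then problems="$problems CRLF"; fi
  if [ -z "$problems" ]; then echo "OK"; else echo "found:$problems"; fi
}

# Two sample files: one clean, one saved with a BOM and CRLF endings.
tmp=$(mktemp -d)
printf '#!/usr/bin/env bash\necho hi\n' > "$tmp/good.sh"
printf '\357\273\277#!/usr/bin/env bash\r\necho hi\r\n' > "$tmp/bad.sh"

good=$(check_script "$tmp/good.sh")
bad=$(check_script "$tmp/bad.sh")
printf 'good.sh: %s\nbad.sh: %s\n' "$good" "$bad"
rm -rf "$tmp"
```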

Step 3: Specify the file extensions that use the filter

Edit the git attributes file in the repo:

.git\info\attributes

If the file doesn’t exist, create it as a new text file with no file extension.
Add a line in the form <file pattern> filter=<filter name> as follows:

*.ps1 filter=gitClean
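One way to confirm the attribute is picked up is git check-attr, which reports the attributes git would apply to a path. The sketch below runs it in a throwaway repository created with mktemp, which is an assumption of the demo; in your own repo only the check-attr command is needed.

```shell
#!/usr/bin/env bash
# Throwaway repo so the demo doesn't touch a real one.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo" || exit 1

# Write the attribute line to the local attributes file.
mkdir -p .git/info
printf '*.ps1 filter=gitClean\n' > .git/info/attributes

# Ask git which filter applies to a .ps1 path and to an unrelated file.
ps1_attr=$(git check-attr filter -- public/Connect-SecurityCenter.ps1)
txt_attr=$(git check-attr filter -- README.txt)
printf '%s\n%s\n' "$ps1_attr" "$txt_attr"
```

The .ps1 path should report gitClean as its filter, while the unrelated file should report the filter as unspecified.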

Step 4: Stage the files in the index

The filter is applied at staging, but it won’t be immediately obvious that anything has happened. Add or update the file using git add:

git add public\Connect-SecurityCenter.ps1

The command will complete as normal. The file in the working directory will still have the real TenantID and AppID, but the staged file will have the replacement dummy values. Commit the staged file to the local repo and push to update the public remote repo.

git commit -m "Update to Connect-SecurityCenter.ps1"
git push origin

Check the GitHub repo to confirm the text replacement was successful.

On subsequent updates, git diff will confirm the text replacement is successful at staging, before commit. It won’t show any differences for these lines, indicating the dummy values are in place.

git diff --staged
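To look at the staged content directly rather than inferring it from the diff, git show :<path> prints the version held in the index. The sketch below sets everything up in a throwaway repository, with a made-up GUID, a stand-in file name and inline sed commands in the per-repo config, purely so it can run in isolation. With the steps above already done, only the git show line is needed.

```shell
#!/usr/bin/env bash
# Hypothetical tenant ID and a throwaway repo, for demonstration only.
real="11111111-2222-3333-4444-555555555555"
dummy="00000000-0000-0000-0000-TENANTID"
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo" || exit 1

# Per-repo filter using inline commands instead of script files.
git config filter.gitClean.clean "sed -e 's/$real/$dummy/g'"
git config filter.gitClean.smudge "sed -e 's/$dummy/$real/g'"
mkdir -p .git/info
printf '*.ps1 filter=gitClean\n' > .git/info/attributes

# A one-line stand-in for the real script.
printf "[String]\$TenantID = '%s'\n" "$real" > Connect.ps1
git add Connect.ps1

working=$(cat Connect.ps1)        # file on disk keeps the real value
staged=$(git show :Connect.ps1)   # blob in the index has the dummy value
printf 'working: %s\nstaged:  %s\n' "$working" "$staged"
```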

Now that git filtering is in place, you can edit and update the code without worrying that unwanted information is in the public domain.



This article was originally posted on Write-Verbose.com