
Include usage and examples of dot sourcing. #102

Open

wtjones opened this issue Nov 22, 2017 · 26 comments

Comments

@wtjones

wtjones commented Nov 22, 2017

For example: https://github.com/pester/Pester

This could lead into patterns of structuring modules or a profile folder.

@OCram85

OCram85 commented Nov 23, 2017

Do you mean a script module like this, which is linked in the module manifest to dot-source all the other files?

# Dot-source every .ps1 file found under the module root
$Functions = Get-ChildItem -Path $PSScriptRoot\*.ps1 -Recurse
ForEach ($Item in $Functions) {
    . $Item.FullName
}

It would be great if such a usage example could lead into a best-practice repository / module structure.

@pauby
Contributor

pauby commented Nov 23, 2017

You can also use the NestedModules key in your manifest to add all of the PS1 files, using paths relative to the manifest.

I'm always wary of using the code that @OCram85 posted, as it simply loads every PS1 file in a folder and its subfolders. But it is a very common way of doing it.
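For illustration, a minimal manifest sketch of that approach -- the module name, file names, and paths here are hypothetical:

@{
    RootModule    = 'MyModule.psm1'
    ModuleVersion = '1.0.0'
    # Each entry is a path relative to this psd1; listed .ps1 scripts are
    # run in the module's session state when the module is imported.
    NestedModules = @(
        'Functions\Get-Foo.ps1',
        'Functions\Set-Foo.ps1'
    )
}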

@wtjones
Author

wtjones commented Nov 24, 2017

I generally do it like this to target a specific folder in the module (I haven't tried nested modules):

$script:moduleRoot = Split-Path -Path $MyInvocation.MyCommand.Path
# Dot source functions from the module's functions folder
"$script:moduleRoot\functions\*.ps1" | Resolve-Path | ForEach-Object { . $_.ProviderPath }

I generally have a flat structure, but I could see the recurse option being useful for larger modules.

@ChrisLGardner

I'm becoming more of a fan of using a build script to pull all the function files into a single psm1 that you bundle with a psd1, and publish just those two as the module. For development you keep a folder structure as required, using private and public functions in their own folders if you want to. You can even write some simple code to add the Export-ModuleMember line to the end with all your public functions.

All the tests then target the psd1/psm1 and test that after it's compiled. You don't run into any weird dot-sourcing problems (not a common thing, but it can happen depending on what is in your function files), and you keep the same level of maintainability you have with the dot-sourcing approach.

@wtjones
Author

wtjones commented Dec 31, 2017

Wouldn't you still need to dot source while in development mode to pull in the functions?

My dev cycle tends to be like this (when not running tests), but then again most of my modules are fairly small:
Import-Module -Force FooModule; Get-FooThing

@ChrisLGardner

ChrisLGardner commented Dec 31, 2017 via email

@wtjones
Author

wtjones commented Jan 2, 2018

That makes sense, but do you have any examples? I am always looking for ways to streamline my process.

@ChrisLGardner

The basic script I use is below. It's a little hard-coded at the end for figuring out which functions to make public (it assumes Verb-Noun), but I'd only want to make those public anyway.

The code coverage part of Invoke-Pester is commented out as it was cluttering up my output and I didn't care too much about it outside of my CI build. I run this with PowerShellGuard so that it auto runs whenever I save a file in the repo.

[cmdletbinding()]
param (
    $SourceFolder = $pwd
)
Write-Verbose -Message "Working in $SourceFolder" -verbose
$Module = Get-ChildItem -Path $SourceFolder -Filter *.psd1 -Recurse | Select-Object -First 1

$DestinationModule = "$($Module.Directory.FullName)\$($Module.BaseName).psm1"
Write-Verbose -Message "Attempting to work with $DestinationModule" -verbose

if (Test-Path -Path $DestinationModule ) {
    Remove-Item -Path $DestinationModule -Confirm:$False -force
}

$PublicFunctions = Get-ChildItem -Path $SourceFolder -Include 'Public', 'External' -Recurse -Directory | Get-ChildItem -Include *.ps1 -File
$PrivateFunctions = Get-ChildItem -Path $SourceFolder -Include 'Private', 'Internal' -Recurse -Directory | Get-ChildItem -Include *.ps1 -File

if ($PublicFunctions -or $PrivateFunctions) {
    Write-Verbose -message "Found Private or Public functions. Will compile these into the psm1 and only export public functions."

    Foreach ($PrivateFunction in $PrivateFunctions) {
        Get-Content -Path $PrivateFunction.FullName | Add-Content -Path $DestinationModule
    }
    Write-Verbose -Message "Found $($PrivateFunctions.Count) Private functions and added them to the psm1."
}
else {
    Write-Verbose -Message "Didnt' find any Private or Public functions, will assume all functions should be made public."

    $PublicFunctions = Get-ChildItem -Path $SourceFolder -Include *.ps1 -Recurse -File
}

Foreach ($PublicFunction in $PublicFunctions) {
    Get-Content -Path $PublicFunction.FullName | Add-Content -Path $DestinationModule
}
Write-Verbose -Message "Found $($PublicFunctions.Count) Public functions and added them to the psm1."

$PublicFunctionNames = $PublicFunctions |
    Select-String -Pattern 'Function (\w+-\w+) {' -AllMatches |
    ForEach-Object {
        $_.Matches.Groups[1].Value
    }
Write-Verbose -Message "Making $($PublicFunctionNames.Count) functions available via Export-ModuleMember"

"Export-ModuleMember -Function {0}" -f ($PublicFunctionNames -join ',') | Add-Content $DestinationModule

$var = Invoke-Pester -Script $SourceFolder -Show Fails #-CodeCoverage $DestinationModule -CodeCoverageOutputFile "$SourceFolder\..\$($Module.Basename)CodeCoverage.xml" -CodeCoverageOutputFileFormat JaCoCo -PassThru -Show Fails

Invoke-ScriptAnalyzer -Path $DestinationModule
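Assuming you've saved this as build.ps1 at the root of the repo (the name and location are just for illustration), running it looks like:

.\build.ps1 -SourceFolder C:\Source\MyModule -Verbose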

@wtjones
Author

wtjones commented Jan 3, 2018

Thanks for sharing. I was not aware of PowerShellGuard.

This is starting to make me think of Python's Cookiecutter: https://github.com/audreyr/cookiecutter. What you are demonstrating is the beginnings of an optimal template for PowerShell modules. (Handy addition to this project?)

Regarding dot sourcing: although my method may not scale well for modules, I find it fairly useful for managing numerous non-module functions in my PS profile.
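For instance, a profile might do something like this (the folder path is hypothetical):

# In $PROFILE: dot-source every helper function kept under a functions folder
Get-ChildItem -Path "$HOME\Documents\PowerShell\functions\*.ps1" |
    ForEach-Object { . $_.FullName }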

Options:

  • Relegate dot-sourcing to something like a profile practices section.
  • Have a dot-sourcing practices section, but omit most/all modules as a strong use case.
  • Consider dot-sourcing not a good practice.

@wtjones
Author

wtjones commented Jan 3, 2018

Wouldn’t debugging and breakpoints be an issue if the functions are copied to the psm1?

@ChrisLGardner

Wouldn’t debugging and breakpoints be an issue if the functions are copied to the psm1?

It's not really an issue: you either run the build script once and then set the breakpoints in the psm1, or set the breakpoints in your Pester tests and run those.
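A sketch of the first option (the script names and line number are hypothetical):

.\build.ps1                                          # compile the psm1 once
Set-PSBreakpoint -Script .\MyModule.psm1 -Line 123   # break inside the built file
Import-Module .\MyModule.psd1 -Force
Get-FooThing                                         # the debugger breaks when line 123 runs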

@wtjones
Author

wtjones commented Jan 4, 2018

I might find that annoying, but then again the VS Code debugger crashes PS so often that I rarely use it.

Dot sourcing is popular enough that I wouldn't fully discount it (Pester, Pester-Format, psake, pscribo, PSDepend, PSDeploy). If there are known issues with it, they could be listed as caveats.

@Jaykul
Member

Jaykul commented Jan 4, 2018

Dot sourcing is popular enough that I wouldn't fully discount it

You're mistaking "what everyone does" for "what everyone should do"

The point of a Best Practices document (and in particular, the point of having discussions in the issues here) is to explain what you should and should not do, and why.

Dot-sourcing is not wrong, although combining dot-sourcing with Get-ChildItem and shipping things to other users probably is. However, there are a lot of reasons why you need to put everything in a single psm1; here are my top three:

  1. Module load performance
  2. Impact of 1 on Get-Command
  3. PowerShell classes require the classes and any functions which use them to be together in the top level PSM1 -- PowerShell will not find classes if they're not in the top-level psm1.

For me, that third point tears it, and makes me say, simply:

The best practice for Modules, is to ship a single psm1 per module.

A best practice is the simple answer that works almost all of the time.

Yes, you can do things differently if your needs are simple. In fact, I have. There are modules written by me, here in public view on GitHub, using every variation on this that you can imagine. There are even a few good reasons to make it more complicated (hiding state from the user would be one reason: variables in nested psm1 modules are harder to discover and get at). But you should not dot-source, because when your module reaches a certain size, or when you need classes, or when you're code-signing ... then you will need to change it, and changing will be harder than doing it this way the first time.

I agree with all of you that (given the state of tools in the PowerShell editing world) it's much easier to edit code when things are spread file-per-function, and I will second @ChrisLGardner that I always run pester tests against the "compiled" single-file module, just in case. I have a module that I want to publish that includes this Optimize-Module function and some tools for converting error messages to the source file and line number to open it in VSCode ... and for copying breakpoints from VSCode to a module runspace -- but I haven't written all of it yet ;-)

@wtjones
Author

wtjones commented Jan 5, 2018

You're mistaking "what everyone does" for "what everyone should do"

I'm not saying that everyone should do it. As a module author, I look at other successful modules for structural patterns, and that is what I found.

The topic of this issue was not meant to be focused on modules, but rather to acknowledge that dot sourcing is a thing, and that maybe there should be best practices for when and when not to use it.

@Jaykul
Member

Jaykul commented Jan 26, 2018

Yeah, it's clearly necessary. I can live with a section on dot-sourcing to explain why it's a bad idea ;-)

@MarkPerry24

MarkPerry24 commented Jun 25, 2018

@Jaykul Your statement "PowerShell classes require the classes and any functions which use them to be together in the top level PSM1 -- PowerShell will not find classes if they're not in the top-level psm1." isn't correct. If you create a folder which contains your class .ps1 and dot source it, it does work. I use this quite often.

Unless you're looking for bleeding-edge performance, I'd argue that the convenience of separate files outweighs the slightly poorer performance. I've not looked at any benchmarks, but modules I've used that do dot-source are not noticeably slower, and since PowerShell was designed primarily with automation in mind, the user isn't there to see it or to worry overly about it.

Suggesting everyone "should" maintain monolithic psm1 files seems a little counter-intuitive to general coding best practices. I would definitely support a move towards separate source and a "compiled" release version; that is functionally similar to a C# project in VS, and it would be fairly trivial to achieve on a release pipeline. It's probably something I will look at playing with, as it IS a good idea.

Another issue you cite is that a psm1 which dot-sources requires code-signing the ps1 files, which again isn't strictly true in my experience. There are best-practice arguments for code signing, primarily due to PS execution policy, and many people are advocating simply calling powershell.exe to bypass these issues. However, it doesn't prevent the code from working, just as it works without dot sourcing -- unless you have a specific example of this issue?

I am guilty of using gci to dot source out of laziness, as any items I add are automatically included, but I use a manifest to only expose publicly what I want to. I "should" probably statically dot source, but I don't see much of an advantage to it. Signing a file, or lots of files, is as easy as gci * -include *.ps1, *.psm1, *.psd1 | Set-AuthenticodeSignature (gci cert:\currentuser\my -CodeSigningCert | Select -First 1) -- though remember to add the timestamp; it's just an example... :P

Whilst there is an overhead of building and checking trust chains, it's really over in a few hundred milliseconds. It happens all the time when surfing the web, for instance: your web browser is constantly checking and building trust chains, even on a seemingly single page.
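For completeness, a timestamped version of that one-liner might look like this (the timestamp server URL is just an example):

gci *.ps1, *.psm1, *.psd1 -Recurse |
    Set-AuthenticodeSignature -Certificate (gci Cert:\CurrentUser\My -CodeSigningCert | Select -First 1) -TimestampServer 'http://timestamp.digicert.com'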

@Jaykul
Member

Jaykul commented Jun 25, 2018

@Someone24github but if you dot source, those classes are basically private -- they are defined only within the file you dot-sourced them into. Which means you can't use them as the type for parameters of functions, or any public interface. That doesn't make them useless, but it does make them less useful and potentially confusing: people following that pattern will eventually try to use the classes on parameters and fail.

I did not suggest you should maintain monolithic psm1 files -- but we are suggesting you should ship monolithic psm1 files.

But let's be clear: if your software needs to be code-signed, you have to sign every script. Anything else, and your software simply isn't code-signed -- and it will fail in any environment where code signing is required (which should be every production environment, where you should not be relying on the easily bypassed ExecutionPolicy, but using AppLocker to prevent unsigned code of any sort from running).

There's no difficulty in signing lots of files (my Authenticode module makes it as simple as sign -module MyModule), but there is repeated overhead every time the module loads to check the signatures.

Frankly: considering that one of the big uses for PowerShell is DSC and automated deployments, I think everyone should be more concerned about performance.

A "few hundred milliseconds" adds up fast

Since you're citing anecdotal load times, let's have story time:

In one specific implementation which I cleaned up at my current employer, there was a module that had grown over the last decade of use into approximately 250 files. These are all code-signed, source-controlled, and shipped as a dot-source module. When I changed the build process to merge them into 5 or 6 code-signed modules -- without any other changes -- it gave us a load-time improvement of over 30 seconds on our test systems. That's barely a 100 ms per-file improvement.

Note that by rights, it could have been longer: imagine if it was a 200ms CRL lookup for each file, and not just file IO and hash checking: 250 files taking 200ms would add 50s to your module load --and that's when you're connected to the internet! If the CRL check fails because you're not network connected yet (e.g. if the scripts are being used to harden the system, so it hasn't been connected to the internet yet), then you have a 15 second timeout ... for each check.

Anyway, to finish the story and drive the point home: that 30-second load improvement was multiplied over only a dozen servers -- which are processed serially because they are load-balanced, etc. -- but the module load time affected about 30 separate software install deployments. You can do the math: 30 s × 12 servers × 30 deployments is 10,800 s. That adds up to about three hours chopped off an automated deployment process.

@ChrisLGardner

Another case for building a single psm1 instead of dot sourcing all the ps1 files is PowerShellGet, which went through this a little while ago (after having a monolithic psm1 for dev purposes for too long); you can see the results in this thread: https://github.com/PowerShell/PowerShellGet/issues/240

The other one I often quote when telling people to use a build process and ship a single psm1 that's built from all the ps1 files is DbaTools. I tested it a month or two ago on my machine, and the difference between their dev psm1 (which dot-sources things) and their production psm1 (which is properly built up) was ~40 seconds to import the dev version and ~10 seconds to import the production one. That's a module with 400+ public commands, some huge number of private ones, and a bunch of classes and stuff for good measure. And the CRL checking comes in there again: my machine had internet access, so I'd hate to see what happens if I didn't and I'm importing 400+ dot-sourced functions that are signed.
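A rough way to reproduce that kind of comparison yourself (the paths are placeholders, and first-import caching will skew single runs, so repeat each in a fresh session):

# Dev layout: psm1 that dot-sources hundreds of ps1 files
Measure-Command { Import-Module C:\Dev\MyModule\MyModule.psd1 -Force }

# Built layout: everything merged into one psm1
Measure-Command { Import-Module C:\Build\MyModule\MyModule.psd1 -Force }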

@MarkPerry24

@Jaykul @ChrisLGardner I'm not opposed to compiling into a single file for performance reasons on large implementations. However, I don't agree it should be a one-size-fits-all approach: small modules will not necessarily see any tangible benefit, and it simply introduces unnecessary complexity. I definitely agree on the large-module front, though, but not due to the issue cited by both of you regarding a CRL check per file, as this isn't how CRL checking works: it'll fetch the CRL once in that session and cache it. The larger the module, the more overhead, I agree, but it's a judgement call rather than "you should always do it this way", IMO. Module load time increases per added file, but I don't think it's CRL issues causing it; it's more likely a combination of reasons.

I liked story time :) though I prefer Brothers Grimm. Using dot-sourced files is not going to add three hours to an automation process unless you really go out of your way to achieve that, or your module needs some architecture change. It may be a monolithic psm1 as a start, but since PowerShell is an interpreted language, the next step may be converting it to C# and making cmdlets rather than functions.

The supposition is incorrect regarding classes being effectively private to the file; a dot-sourced class is effectively scoped to the module. Private (annoyingly) simply doesn't exist in PowerShell.... If you dot source the file containing the class, you get access to that class in the module. I have read that some people prefer to use Import-Module against it from the psm1, or the problematic "using" statement. However, you can happily dot source and use it in a function, e.g. "function myfunction { param ( [myclass]$bob ) begin{} process{} end{} }".
As a working example (for brevity I didn't put the function into a separate file):
Layout:

TestModule
|-- TestModule.psm1
|-- Types
    |-- MyNumbers.ps1

Content:

TestModule.psm1:

$psroot = $PSScriptRoot
. $psroot\Types\MyNumbers.ps1

function add {
    [MyNumbers]::AddNumbers(1, 2)
    [MyNumbers]$number = @{ NumberOne = 5; NumberTwo = 10 }
    $number
}

MyNumbers.ps1:

Class MyNumbers {
    [int32]$NumberOne
    [int32]$NumberTwo

    MyNumbers() {}

    MyNumbers ([int32]$one, [int32]$two) {
        $this.NumberOne = $one
        $this.NumberTwo = $two
    }

    static [int32] AddNumbers([int32]$one, [int32]$two) {
        return $one + $two
    }
}
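Assuming that layout is saved to disk, usage would look something like this (the import path is hypothetical):

Import-Module .\TestModule\TestModule.psm1 -Force
add    # works: [MyNumbers] was dot-sourced into the psm1, so functions inside the module can use it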

@Jaykul
Member

Jaykul commented Jun 26, 2018

First of all, the whole point of the Best Practices is to speak about the best way -- if there are lots of ways to do something, and they're all equally good (or we can't agree on which way is best), then we don't need to talk about it in the best practices book...

However, in this case, although there are many ways to put together and ship a module, shipping a single psm1 is, in fact, the best way: it works when your module is small, and it scales well when your module grows. If you ship it dot-sourced to start with, you incur that performance penalty as you add more and more features, until it gets so big your users start complaining about it and you finally have to address it. We know this to be true, because we have done it that way! Dot-sourced modules are how most of us wrote modules for years -- and we've learned about the performance problems the hard way.

@Someone24github the story I wrote down above is something that actually happened with real code at a real company. Your insistence that it's not that big a deal doesn't change the fact that it did, in fact, have that big of an impact.

@Jaykul
Member

Jaykul commented Jun 26, 2018

As far as classes -- @Someone24github your example is using private classes: they are only used within the module, and the users don't even know they exist, and can't use them. In fact, you can't really choose to expose them to users later, because a module like this doesn't work:

TestModule.psm1

. $PSScriptRoot\Types\Thing.ps1

function Get-Thing1 {
    param(
        [Thing]$Thing
    )
    $Thing
}

function Get-Thing2 {
    param(
        [string]$Name
    )
    Get-Thing1 $Name
}

Types\Thing.ps1

class Thing {
    $Name = "Thing1"
    Thing([string]$name) {
        $this.Name = $name
    }
}

When you import this TestModule, you cannot call Get-Thing1 -- it works from other functions inside the module, so you can call Get-Thing2 even though it calls Get-Thing1 ... but the Type can't be seen by users of the module. It's effectively private.

@MarkPerry24

@Jaykul I've said numerous times now that I understand the point and it is a good fit and something I will use for larger projects. For small projects we'll just have to disagree...

Classes -- your original statement was "they are defined only within the file you dot-sourced them into"; I was responding to that.
If you want Get-Thing1 to work, create a psd1:

@{
    RootModule       = 'TestModule.psm1'
    ModuleVersion    = '1.0'
    ScriptsToProcess = '.\Types\Thing.ps1'
}

It should now work: ScriptsToProcess dot-sources the listed script into the caller's session state on import, which is what makes the class visible outside the module.

@MarkPerry24

@Jaykul The type can't be seen? Get-Thing2 returned the object and its type. In fact, in the first iteration, after calling Get-Thing2 you can now call Get-Thing1 from the shell without Get-Thing2. It's not a behavior I like, but it is what it is... How would you want a user to use [Thing] -- as a type accelerator, with New-Object, etc.?

@MarkPerry24

@Jaykul I believe you on the story :) Your needs were due to a large module, as you said, and I totally agree for large modules. So your proposal is: in your dev branch use split files (in fact, knock yourself out), and in master or similar maintain a compiled psm1?

@Jaykul
Member

Jaykul commented Jun 26, 2018

Well, I use a function to build the module from the ps1 files in source.

As far as I'm concerned, the main usefulness of real classes in PowerShell is as parameter types (to allow conversions, or to allow one function to collect a bunch of values and pass them as a single parameter to another function, etc.), or to implement things I need in order to talk to .NET classes. They're also useful for output and formatting -- but we could do this without a real "class":

[PSCustomObject]@{
    PSTypeName = "Thing"
    Name       = "Thing2"
}

I don't ever use them the way you did in your example, just as containers for methods.

You are right, of course, that calling a function (like Get-Thing2) which outputs a type will result in PowerShell "knowing" that type -- and that this behavior is awful 😉.

If you just put the class definition at the top of the psm1 file, you don't need to do anything more -- no arcane incantations, no helper functions to output the types and make them behave -- it just works the way it was meant to 😃

@MarkPerry24

@Jaykul I'm slightly confused, but I think we're mostly on the same page, minor differences aside. The usage you attribute to the real power of classes is conversions, or passing values as a single parameter? Both of those, as you describe them, you already get for free with other classes or in functions, so in those cases I would argue you're probably using classes for no tangible benefit.

However, I see a lot of merit in classes for a number of other reasons. Type checking is number one: if I say a script can only take a given type, then it's game over if the argument isn't that type or there's no constructor to support it. Yes, you could use a PSCustomObject, but then you have to accept all of them and process them in a ValidateScript attribute -- and you want this because if you try to reassign a PSCustomObject to that variable it gets re-evaluated (my favourite feature; I cried blood when I first discovered this nuance). That would quickly become annoying: a huge ValidateScript attribute in each function. You can avoid this in your private functions, I guess, but it's a risky strategy.

Another benefit is that you can use some tricks to ensure you have only one copy of a class rather than instantiating a new one. You can also inherit from other classes, such as hashtable or your own classes, to extend them. On the flip side, they also have their downsides: nothing is private; hidden is all you have, and you can still get the hidden data directly from the object. There is a ton of good stuff when using classes....
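A quick sketch of two of those points -- inheritance and the "hidden is all you have" caveat (the class and member names are made up):

class Numbers : hashtable {           # inheriting from hashtable to extend it
    hidden [string] $Secret = 'shh'   # 'hidden' hides it from Get-Member and tab completion...
    [int] $Rating
}

$n = [Numbers]::new()
$n['key'] = 1      # still behaves as a hashtable, plus its own members
$n.Secret          # ...but the hidden data is still directly accessible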
