[{"data":1,"prerenderedAt":4424},["ShallowReactive",2],{"navigation":3,"search":46,"blog-shelly-mcp-server-home-assistant-migration":601,"translation-shelly-mcp-server-home-assistant-migration":1009,"related-shelly-mcp-server-home-assistant-migration":1031},[4],{"title":5,"path":6,"stem":7,"children":8,"page":45},"Blog","/blog","blog",[9,13,17,21,25,29,33,37,41],{"title":10,"path":11,"stem":12},"Azure Privileged Identity Management as code","/blog/azure-privileged-identity-management-as-code","blog/azure-privileged-identity-management-as-code",{"title":14,"path":15,"stem":16},"Branch Manager: A Web UI for Cleaning Up Stale Azure DevOps Branches","/blog/branch-manager-azure-devops","blog/branch-manager-azure-devops",{"title":18,"path":19,"stem":20},"Azure Conditional Role Assignments with Bicep!","/blog/conditional-role-assignments","blog/conditional-role-assignments",{"title":22,"path":23,"stem":24},"Centralizing Password Policy Management in Multi-Tenant Entra ID Environments","/blog/entraid-banned-password-list","blog/entraid-banned-password-list",{"title":26,"path":27,"stem":28},"From Hugo to Nuxt: Why I Switched to Vibe Code My Blog","/blog/from-hugo-to-nuxt-vibe-coding","blog/from-hugo-to-nuxt-vibe-coding",{"title":30,"path":31,"stem":32},"Intune & Ubuntu 24.04","/blog/intune-ubuntu-24-04","blog/intune-ubuntu-24-04",{"title":34,"path":35,"stem":36},"PIM + Conditional Role Assignments: Secure Autonomy for Azure Landing Zones","/blog/pim-conditional-role-assignments","blog/pim-conditional-role-assignments",{"title":38,"path":39,"stem":40},"Why I Built a Shelly MCP Server (So I Could Migrate to Home Assistant)","/blog/shelly-mcp-server-home-assistant-migration","blog/shelly-mcp-server-home-assistant-migration",{"title":42,"path":43,"stem":44},"My Ultimate Self-Hosted AI Chat Stack","/blog/ultimate-selfhosted-ai-chat","blog/ultimate-selfhosted-ai-chat",false,[47,51,57,62,67,72,77,82,87,93,98,104,109,114,119,122,127,132,137,142,147,152,157,162,167,170,175,180,184,189,194,199,204,209,214,219,224,229,234,237,242,246,251,255,260,265,270,275,280,285,290,295,300,305,309,312,317,322,327,332,337,342,345,350,355,360,365,370,375,380,385,390,395,400,404,407,412,417,422,427,432,437,442,445,450,454,458,463,468,473,478,483,488,493,498,501,505,510,514,519,524,529,534,539,544,549,554,559,564,568,573,578,581,586,591,596],{"id":11,"title":10,"titles":48,"content":49,"level":50},[],"Configure PIM Eligible Role Assignments on Azure subscriptions using the ARM API in PowerShell, including role policies, approvers, and eligible role creation. In this guide, we will delve into the intricacies of configuring Privileged Identity Management (PIM) Eligible Role Assignments on Azure subscriptions using the ARM API in PowerShell. As seasoned professionals, we recognize that leveraging PIM in Azure is a strategic imperative. However, as DevOps Engineers, we also acknowledge the challenges posed by incorporating Eligible role assignments into deployments. In this post I will expose all the intricacies concerning this piece of automation. Microsoft provides Microsoft Graph cmdlets for Entra ID PIM, but for Azure PIM Role Assignments you must use the Azure Resource Manager (ARM) API. Before we dive into the details, I want to give a shout-out to my colleague Bjorn Peters. He did the refactoring of the JSON-body into a PowerShell object so it's easier to manipulate the code. This provides with a cleaner looking script. Also check out his blog for interesting articles about Azure, DevOps and automation.",1,{"id":52,"title":53,"titles":54,"content":55,"level":56},"/blog/azure-privileged-identity-management-as-code#arm-api","ARM API",[10],"As mentioned, Microsoft provides us with the Azure Resource Manager (ARM) API for configuring PIM assignments. While this API offers unparalleled flexibility, deciphering the precise configuration details can sometimes be like unraveling a labyrinth. In this blogpost I will provide code snippets, making automation of PIM eligible roles in your environment a breeze. As mentioned before, we will do everything in PowerShell including the body. Hopefully this will provide you with more insights on how to use the code snippets in your automation.",2,{"id":58,"title":59,"titles":60,"content":61,"level":56},"/blog/azure-privileged-identity-management-as-code#the-requirements","The requirements",[10],"An IDE (such as Visual Studio Code)Proficiency in PowerShell\nAzure PowerShell moduleMicrosoft Graph moduleEntra ID Premium license for PIMoptional: Custom role definition",{"id":63,"title":64,"titles":65,"content":66,"level":56},"/blog/azure-privileged-identity-management-as-code#configuration-steps","Configuration steps",[10],"PIM configuration exists of two steps: Define Role Settings: These settings determine when role activation occurs. Think of them as your compass. Create Eligible Role Assignments: This step associates roles with users or groups, allowing temporary permission elevation using PIM.",{"id":68,"title":69,"titles":70,"content":71,"level":56},"/blog/azure-privileged-identity-management-as-code#custom-role-definitions","Custom Role Definitions",[10],"Scope Matters: When creating custom role definitions, consider the scope. If you intend to target multiple subscriptions, create the custom role definition at the highest possible level. Why? Because each role definition has its own unique GUID. If you create a new GUID for every subscription, your automation complexity increases significantly. Example Scenario: Imagine an enterprise aiming to restrict developer access. They define a custom role called Developer with GUID 3a9c9b30-bb02-43c2-a487-0d5aff050fec. Now, they want to assign this custom role via PIM to different security groups across various subscriptions. In this case, it's prudent to create the role definition at a higher level, ensuring consistent GUIDs for role assignments across different levels. Remember these principles as you want to use custom role definitions within PIM. It will streamline your development processes. Visualization of PIM a custom role definition:",{"id":73,"title":74,"titles":75,"content":76,"level":56},"/blog/azure-privileged-identity-management-as-code#tasks","Tasks",[10],"In this section we are going to execute the following tasks: #Task1.Connect to the environment2.Connect with MG Graph. Used for the creation of groups3.Create 2 new security groups. 1 for PIM requests, 1 for approval of requests4.Create a basic function to obtain headers. Used for making API calls5.Store recurring values in objects6.Update role policy with a custom role settings7.Assign the eligible role In the end we are also going to test our setup.",{"id":78,"title":79,"titles":80,"content":81,"level":56},"/blog/azure-privileged-identity-management-as-code#getting-started","Getting started",[10],"Connect with your Azure environment Connect-AzAccount to get started.Connect with MG Graph with at least Group read/write permissions: Connect-MgGraph -Scopes \"Group.ReadWrite.All\" Create a new Security Group that will be assigned the PIM Eligible role: $pimRequestorGroup = New-MgGroup -DisplayName 'pim-requestor-sg' -MailEnabled:$False  -MailNickName 'pim-test-sg' -SecurityEnabled\n$pimApproverGroup = New-MgGroup -DisplayName 'pim-approver-sg' -MailEnabled:$False  -MailNickName 'pim-test-sg' -SecurityEnabled Obtain headers. To be able to execute calls to the ARM API we need to obtain the correct headers. We will wrap this in a function for easier use throughout the codebase: Function Get-Headers {\n    Param (\n        [Parameter(Mandatory)]\n        [Array]$Context\n    )\n    $azProfile = [Microsoft.Azure.Commands.Common.Authentication.Abstractions.AzureRmProfileProvider]::Instance.Profile\n    $profileClient = New-Object -TypeName Microsoft.Azure.Commands.ResourceManager.Common.RMProfileClient -ArgumentList ($azProfile)\n    $token = $profileClient.AcquireAccessToken($Context.Subscription.TenantId)\n    $authHeader = @{\n        'Content-Type'  = 'application/json'\n        'Authorization' = 'Bearer ' + $token.AccessToken\n    }\n    return $authHeader\n} If we call the function and save the output in an object we can reuse the headers with every API call, like so: $headers = Get-Headers -Context (Get-AzContext) Store reusable values in objects to use later and switch to the correct context. Find the role definition IDs here. $subscription = Get-AzSubscription -SubscriptionId 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx' # > replace this with your own subscription ID\n# Switch to the target subscription\nSet-AzContext -Subscription $subscription\n\n$eligibleAssignmentDetails = [PSCustomObject]@{\n    Id               = $pimRequestorGroup.Id # Here we enter the pimRequestorGroup ID set earlier\n    DisplayName      = $pimRequestorGroup.DisplayName # Here we enter the pimGroup DisplayName set earlier\n    EligibleRole     = 'Contributor'\n}\n\n$contributorRoleId = 'b24988ac-6180-42a0-ab88-20f7382dd24c' # This is the targeted role Update the role policy so it requires an approval on activation. # First, get the current role policy. We do this because you are only allowed to update role policies:\n$getRolePolicyUri = \"https://management.azure.com/providers/Microsoft.Subscription/subscriptions/{0}/providers/Microsoft.Authorization/roleManagementPolicies?api-version=2020-10-01&`$filter=roleDefinitionId%20eq%20'subscriptions/{0}/providers/Microsoft.Authorization/roleDefinitions/{1}'\" -f $subscription.Id, $contributorRoleId\n$rolePolicy = (Invoke-RestMethod -Uri $getRolePolicyUri -Method Get -Headers $headers).value Now that we obtained the role policy, we need to create a body to update the policy to our liking. For reference checkout the docs. The body in PowerShell gives us more flexibility if we want to loop through multiple roles/groups/approvers etc. Here goes: # Assemble body for this request\n$body = [PSCustomObject]@{\n    properties = [PSCustomObject]@{\n        rules = @(\n            # Enter the basics\n            [PSCustomObject]@{\n                isExpirationRequired = $true\n                maximumDuration      = 'PT8H' #Role can be activated for a maximum of 8 hours\n                id                   = 'Expiration_EndUser_Assignment'\n                ruleType             = 'RoleManagementPolicyExpirationRule'\n                target               = @{\n                    caller     = 'EndUser'\n                    operations = @(\n                        'All'\n                    )\n                }\n                level                = 'Assignment'\n            },\n            [PSCustomObject]@{\n                enabledRules = @(\n                    'Justification', # Requires the user to add a justification in their request\n                    'MultiFactorAuthentication' # Requires MFA authentication for the request\n                )\n                id           = 'Enablement_EndUser_Assignment'\n                ruleType     = 'RoleManagementPolicyEnablementRule'\n                target       = @{\n                    caller     = 'EndUser'\n                    operations = @(\n                        'All'\n                    )\n                    level      = 'Assignment'\n                }\n            },\n            [PSCustomObject]@{\n                isExpirationRequired = $false # Makes the role permanently eligible\n                maximumDuration      = 'P365D' # Maximum duration of eligible role assignment\n                id                   = 'Expiration_Admin_Eligibility'\n                ruleType             = 'RoleManagementPolicyExpirationRule'\n                target               = @{\n                    caller     = 'Admin'\n                    operations = @(\n                        'All'\n                    )\n                    level      = 'Eligibility'\n                }\n            },\n            # The next section adds Approvers to the body object. If you don't want/need approvers you can leave this part out of your script:\n            [PSCustomObject]@{\n                setting  = [PSCustomObject]@{\n                    isApprovalRequired               = $true # Makes approval required for the request on this role\n                    isApprovalRequiredForExtension   = $false\n                    isRequestorJustificationRequired = $true\n                    approvalMode                     = 'SingleStage'\n                    approvalStages                   = @(\n                        @{\n                            approvalStageTimeOutInDays      = 1\n                            isApproverJustificationRequired = $true\n                            escalationTimeInMinutes         = 0\n                            isEscalationEnabled             = $false\n                            primaryApprovers                = @(\n                                [PSCustomObject]@{\n                                    id          = $pimApproverGroup.Id # Reference to the Security Group that was created earlier\n                                    description = $null\n                                    isBackup    = $false\n                                    userType    = \"Group\"\n                                }\n                            )\n                        }\n                    )\n                }\n                id       = 'Approval_EndUser_Assignment'\n                ruleType = 'RoleManagementPolicyApprovalRule'\n                target   = @{\n                    caller     = 'EndUser'\n                    operations = @(\n                        'All'\n                    )\n                    level      = 'Assignment'\n                }\n            }\n        )\n    }\n} We have created the body, now it's time to update the role policy! First we construct the URI by inserting the ID we got from obtaining the role policy. Secondly we update the role policy by calling the API with a PATCH method. If all goes well the role policy should be updated. $patchRolePolicyUri = \"https://management.azure.com{0}?api-version=2020-10-01\" -f $rolePolicy.id\n$patchPolicyRequest = Invoke-RestMethod -Uri $patchRolePolicyUri -Method Patch -Headers $headers -Body ($body | ConvertTo-Json -Depth 10) Before: After: Assign the Eligible role to the pimRequestorGroup Security Group: # Create the Eligible role with a custom GUID\n# Create body\n$body = @{\n    Properties = @{\n        RoleDefinitionID = \"/subscriptions/$Subscription.Id/providers/Microsoft.Authorization/roleDefinitions/$contributorRoleId\"\n        PrincipalId      = $pimRequestorGroup.Id\n        RequestType      = 'AdminAssign'\n        ScheduleInfo     = @{\n            Expiration = @{\n                Type = 'NoExpiration'\n            }\n        }\n    }\n}\n$guid = [guid]::NewGuid()\n# Construct Uri with subscription Id and new GUID\n$createEligibleRoleUri = \"https://management.azure.com/providers/Microsoft.Subscription/subscriptions/{0}/providers/Microsoft.Authorization/roleEligibilityScheduleRequests/{1}?api-version=2020-10-01\" -f $Subscription.Id, $guid\n\n# Call the API with PUT to assign the role to the targeted Security Group\nInvoke-RestMethod -Uri $createEligibleRoleUri -Method Put -Headers $headers -Body ($body | ConvertTo-Json -Depth 10) Result:",{"id":83,"title":84,"titles":85,"content":86,"level":56},"/blog/azure-privileged-identity-management-as-code#testing","Testing!",[10],"All is now in place to test the setup. We followed the steps to create 2 security groups, update a role policy, assign the eligible role and finally we need to test this configuration:",{"id":88,"title":89,"titles":90,"content":91,"level":92},"/blog/azure-privileged-identity-management-as-code#group-membership","Group Membership",[10,84],"Add members to the created groups: Remember, users cannot approve their own requests. pimRequestorGroup: Users who can request activation of the Contributor role. pimApprovalGroup: Approvers responsible for granting or denying requests.",3,{"id":94,"title":95,"titles":96,"content":97,"level":92},"/blog/azure-privileged-identity-management-as-code#test-the-workflow","Test the workflow",[10,84],"Test the setup with the separate accounts:",{"id":99,"title":100,"titles":101,"content":102,"level":103},"/blog/azure-privileged-identity-management-as-code#requestor","Requestor",[10,84,95],"Make a PIM request: Fill in the justification and hit 'Submit':",4,{"id":105,"title":106,"titles":107,"content":108,"level":103},"/blog/azure-privileged-identity-management-as-code#approver","Approver",[10,84,95],"As an approver, start the approval of the PIM request by selecting the request and clicking 'Approve': Finally, check the request and if it meets requirements approve it with a justification:",{"id":110,"title":111,"titles":112,"content":113,"level":103},"/blog/azure-privileged-identity-management-as-code#validate","Validate",[10,84,95],"Validate if the requesting user has an active role assignment under 'My Roles' in Privileged Identity Management:",{"id":115,"title":116,"titles":117,"content":118,"level":56},"/blog/azure-privileged-identity-management-as-code#conclusion","Conclusion",[10],"That concludes this blog about configuring PIM via the ARM API with PowerShell. We have successfully: ✅ Created 2 new security groups. 1 for PIM requests, 1 for approval of requests✅ Wrote a basic function to obtain headers that we used for making API calls✅ Updated a role policy with custom role settings✅ Assigned the eligible role to our Security Group✅ Successfully tested the workflow 🔐 With PIM now in place, your organization gains precise control over permissions, enhancing security and compliance. Hopefully this post provided you with some guidance on how to automate PIM in your environment. Good luck and feel free to reach out if you have any questions! 🚀 html pre.shiki code .suiK_, html code.shiki .suiK_{--shiki-light:#005CC5;--shiki-default:#79B8FF;--shiki-dark:#79B8FF}html pre.shiki code .so5gQ, html code.shiki .so5gQ{--shiki-light:#D73A49;--shiki-default:#F97583;--shiki-dark:#F97583}html pre.shiki code .slsVL, html code.shiki .slsVL{--shiki-light:#24292E;--shiki-default:#E1E4E8;--shiki-dark:#E1E4E8}html pre.shiki code .sfrk1, html code.shiki .sfrk1{--shiki-light:#032F62;--shiki-default:#9ECBFF;--shiki-dark:#9ECBFF}html .light .shiki span {color: var(--shiki-light);background: var(--shiki-light-bg);font-style: var(--shiki-light-font-style);font-weight: var(--shiki-light-font-weight);text-decoration: var(--shiki-light-text-decoration);}html.light .shiki span {color: var(--shiki-light);background: var(--shiki-light-bg);font-style: var(--shiki-light-font-style);font-weight: var(--shiki-light-font-weight);text-decoration: var(--shiki-light-text-decoration);}html .default .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}html.dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}html pre.shiki code .shcOC, html code.shiki .shcOC{--shiki-light:#6F42C1;--shiki-default:#B392F0;--shiki-dark:#B392F0}html pre.shiki code .sQHwn, html code.shiki .sQHwn{--shiki-light:#E36209;--shiki-default:#FFAB70;--shiki-dark:#FFAB70}html pre.shiki code .sCsY4, html code.shiki .sCsY4{--shiki-light:#6A737D;--shiki-default:#6A737D;--shiki-dark:#6A737D}",{"id":15,"title":14,"titles":120,"content":121,"level":50},[],"I built a self-hosted web tool to filter, review, and bulk-delete stale branches across all repositories in an Azure DevOps project, because the portal was never designed for this.",{"id":123,"title":124,"titles":125,"content":126,"level":50},"/blog/branch-manager-azure-devops#introduction","Introduction",[],"Every team I have worked with has the same problem at some point. You open Azure DevOps, navigate to a repository, and there are 200 branches listed. Half of them are from features that shipped two years ago. A handful are from developers who left the company. A few have names like test-fix-final-v3 and nobody knows what they were for. Some commit message are cryptic as well. What would xyz or * mean? The Azure DevOps portal is excellent for many things. Branch cleanup is not one of them. You can delete branches one at a time from the repository view. A tedious process if you need vigorous cleaning. There is no easy way to filter branches by age across all repos in a project, select a batch, and remove them in one go. If you are managing more than a handful of repositories, the manual process gets old quickly. I kept meaning to write a script for it. I never quite did. Then I decided to build something slightly more permanent.",{"id":128,"title":129,"titles":130,"content":131,"level":56},"/blog/branch-manager-azure-devops#the-problem-with-branch-clutter","The Problem with Branch Clutter",[124],"Stale branches are not just an aesthetic issue. They create real noise. When a developer runs git branch -r or opens the branch selector in a PR, they are scrolling past dozens of dead ends. It slows down onboarding, because new team members cannot tell which branches are active and which are relics. It complicates repository hygiene at scale, especially when you have tens of repositories in a project. The other problem is safety. You do not want to bulk-delete branches without knowing what you are removing. Some branches have active pipelines. Some protect long-running release tracks. Any bulk cleanup tool needs to handle that clearly.",{"id":133,"title":134,"titles":135,"content":136,"level":56},"/blog/branch-manager-azure-devops#presenting","Presenting...",[124],"Branch Manager! And even in dark mode! Branch Manager is a self-hosted web application. You run it locally or host it on Azure App Service, point it at your Azure DevOps organization, sign in, and get a filterable table of every branch across all repositories in a project. From there you can: Filter by repository, branch name, and age so you can target feature/ branches older than 90 days, for exampleSort by last commit date or authorSee who last touched a branch and what the last commit message wasProtect branches automatically: any branch with an Azure DevOps policy attached to it is highlighted and locked from deletion, so you cannot accidentally remove a protected default branchAdd custom protection patterns, useful for protecting release/, hotfix/, or any prefix your team usesSelect and delete in bulk, with a confirmation dialog that shows you exactly what is about to go",{"id":138,"title":139,"titles":140,"content":141,"level":56},"/blog/branch-manager-azure-devops#authentication-two-modes","Authentication: Two Modes",[124],"Branch Manager supports two ways to authenticate against Azure DevOps. The first is a Personal Access Token. This is the quickest option if you are running it for yourself. No app registration needed. You enter your organization name and a PAT with Code.ReadWrite permissions, and you are in. The second is Microsoft Entra ID. This is the recommended option if you want to host Branch Manager for a team. You register a single-page application in Entra, grant it the user_impersonation permission on Azure DevOps, and your colleagues can sign in with their work account through the standard Microsoft login flow. This prevents the use of shared secrets and avoids using PATs altogether. Because everyone signs in with their own account you have an audit trail as well. One important note: Entra ID authentication for Azure DevOps requires a work or school account. Personal Microsoft accounts do not work here. That is a Microsoft restriction, not something Branch Manager can change.",{"id":143,"title":144,"titles":145,"content":146,"level":56},"/blog/branch-manager-azure-devops#how-it-was-built","How It Was Built",[124],"I must admin: this is a vibe coding project. The first version was a PowerShell script and although it worked -barely- it was 335 lines of something I did not want to maintain. So I let Github Copilot rebuilt it as a proper Node.js web app. The backend is Express. It proxies requests between the browser and the Azure DevOps REST API and handles authentication, rate limiting, and the branch lookup and delete operations. The frontend is plain HTML, CSS, and vanilla JavaScript without a framework. There is no build step and no bundler on the client side because I wanted a lightweight application. It is simple a new GUI for the DevOps API.",{"id":148,"title":149,"titles":150,"content":151,"level":92},"/blog/branch-manager-azure-devops#lessons-learned","Lessons learned",[124,144],"That sounds nice and all, but my git history tells a different story. Let me share my 6 biggest bumps in the road: I had to rewrite the Authentication part twice. The first attempt used MSAL Node on the server side, which meant managing the OAuth code flow server-side and dealing with session state. How it worked? I don't know because I yolo'd Copilot to do it. Soon I discovered iot worked in theory but added too much complexity for a personal tool. I scrapped it and started over with msal-browser, which acquires the Entra ID access token entirely in the browser using PKCE. The server never sees a client secret and never stores a token. Much simpler. And with examples!Azure DevOps does not return 401 errors when a token is rejected. It returns a 302 redirect to a sign-in page. That sounds like a minor detail but it completely changes how you detect auth failures. A normal response.ok check passes on a 302. You get back an HTML login page instead of JSON and the error surfaces somewhere downstream in a confusing way. I had to add explicit handling for all redirect status codes and map them to a useful error message.Helmet's Content Security Policy blocked MSAL's CDN. Helmet ships with a default CSP that locks down most external script sources. MSAL Browser loads from alcdn.msauth.net, makes token requests to login.microsoftonline.com, and needs those origins in scriptSrc and connectSrc respectively. None of those are in Helmet's defaults. Easy to fix once you understand what is happening, but the browser console errors were not immediately obvious about which policy rule was blocking what.Helmet's crossOriginOpenerPolicy breaks popup window communication. This one took longer. The default value same-origin prevents the opener page from reading the popup's location after it navigates. That is exactly the mechanism MSAL popup flow depends on. Setting it to same-origin-allow-popups fixed it, but it is not a setting you would think to check first.Tokens were appearing in request logs. The Express request logger I added for troubleshooting was faithfully printing every URL, including OAuth redirects that carry authorization codes and access tokens as query parameters. I added a sanitization step that redacts those parameters before logging. It is a small thing but it matters if logs end up in any kind of monitoring system.The Azure DevOps REST API surface for branches is fairly large. The refs endpoint, the commit details endpoint, the branch stats endpoint, and the batch delete operation all behave slightly differently and the documentation has some gaps. Copilot was genuinely useful here. It could reason about the response shapes and suggest the right request format for things like the batch delete, which expects an array of ref update objects with newObjectId set to forty zeros to signal deletion. That is not something I would have guessed, but Copilot brought me the answers.",{"id":153,"title":154,"titles":155,"content":156,"level":56},"/blog/branch-manager-azure-devops#getting-started","Getting Started",[124],"Alright, let's get to the interesting part: Installation! You need Node.js 18 or higher and (of course) an Azure DevOps organization. Clone the repo, install dependencies, and start the server: git clone https://github.com/jdgoeij/BranchManager.git\ncd BranchManager/server\nnpm install\nnpm start The app opens at http://localhost:8080. For PAT authentication you are ready to go immediately. Just generate a Code Read and Write PAT and use is. For Entra ID, follow the configuration steps in the README to register the app and add your credentials to server/.env.",{"id":158,"title":159,"titles":160,"content":161,"level":56},"/blog/branch-manager-azure-devops#hosting-it-for-your-team","Hosting It for Your Team",[124],"If you want to make Branch Manager available to your whole team, Azure App Service is the simplest option. The server/ folder is a self-contained Node.js app and deploys directly. The README covers three paths: Azure CLI for the fastest setup, the VS Code Azure App Service extension if you prefer a UI, and a GitHub Actions workflow if you want automated deployments on every push to main. Make sure you add your App Service URL as a redirect URI in your Entra app registration and set REDIRECT_URI as an environment variable on the App Service. Without this, the OAuth redirect after sign-in will not work. The README walks through exactly what to set.",{"id":163,"title":164,"titles":165,"content":166,"level":56},"/blog/branch-manager-azure-devops#what-is-next","What Is Next",[124],"A few things are on my list. The branch table currently loads one project at a time. I want to add a cross-project view so you can see stale branches across your entire organization in one pass. This is a larger API surface but the foundation is already there. I noticed there is a /api/all-branches endpoint on the server that does exactly this. I also want to add a CSV export. Sometimes the right action is not deletion but a review with the team first. Being able to export the filtered branch list with last commit info and committer makes that conversation easier. If you run into something that does not work or have a feature in mind, open an issue on GitHub. The codebase is straightforward enough that contributions are very welcome. Happy to answer questions. Find me on LinkedIn. html pre.shiki code .shcOC, html code.shiki .shcOC{--shiki-light:#6F42C1;--shiki-default:#B392F0;--shiki-dark:#B392F0}html pre.shiki code .sfrk1, html code.shiki .sfrk1{--shiki-light:#032F62;--shiki-default:#9ECBFF;--shiki-dark:#9ECBFF}html pre.shiki code .suiK_, html code.shiki .suiK_{--shiki-light:#005CC5;--shiki-default:#79B8FF;--shiki-dark:#79B8FF}html .light .shiki span {color: var(--shiki-light);background: var(--shiki-light-bg);font-style: var(--shiki-light-font-style);font-weight: var(--shiki-light-font-weight);text-decoration: var(--shiki-light-text-decoration);}html.light .shiki span {color: var(--shiki-light);background: var(--shiki-light-bg);font-style: var(--shiki-light-font-style);font-weight: var(--shiki-light-font-weight);text-decoration: var(--shiki-light-text-decoration);}html .default .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}html.dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}",{"id":19,"title":18,"titles":168,"content":169,"level":50},[],"Implement secure workload autonomy using Azure conditional role assignments with Bicep and Azure Verified Modules. In modern cloud environments, finding the right balance between workload autonomy and security control is crucial. While development teams need extensive permissions to manage their resources effectively, security teams must ensure these privileges don't compromise the organization's security posture. Azure's conditional role assignments provide an elegant solution to this challenge, allowing us to grant broad permissions while maintaining strict security boundaries.",{"id":171,"title":172,"titles":173,"content":174,"level":56},"/blog/conditional-role-assignments#the-challenge","The Challenge",[18],"Traditional role-based access control (RBAC) often forces organizations to choose between two suboptimal approaches: Granting full Owner rights, risking security by allowing teams to escalate privilegesImplementing restrictive custom roles, creating operational overhead and potential bottlenecks Conditional role assignments offer a middle ground, enabling us to grant Owner permissions while preventing specific high-risk actions through conditions.",{"id":176,"title":177,"titles":178,"content":179,"level":56},"/blog/conditional-role-assignments#understanding-role-assignment-conditions","Understanding Role Assignment Conditions",[18],"Role assignment conditions in Azure add an extra layer of security by allowing us to specify when and how permissions can be used. These conditions are evaluated at runtime and can reference various attributes of the request context, including: The target resource's propertiesThe type of action being performedThe principal's claimsThe environment context The power of conditions lies in their ability to create fine-grained access controls without sacrificing operational efficiency.",{"id":181,"title":59,"titles":182,"content":183,"level":56},"/blog/conditional-role-assignments#the-requirements",[18],"Before implementing conditional role assignments, ensure you have: Access to Azure with permissions to manage role assignmentsUnderstanding of Azure RBAC and built-in rolesFamiliarity with Bicep or ARM templatesAzure CLI or Azure PowerShell installedBicep installed",{"id":185,"title":186,"titles":187,"content":188,"level":56},"/blog/conditional-role-assignments#implementation-strategy","Implementation Strategy",[18],"In this guide, we'll focus on implementing a secure workload autonomy pattern with the following objectives: Grant Owner permissions to workload teamsPrevent privilege escalation by blocking critical role assignmentsMaintain audit capabilitiesImplement the solution using Infrastructure as Code Let's dive into the technical implementation of these requirements.",{"id":190,"title":191,"titles":192,"content":193,"level":56},"/blog/conditional-role-assignments#role-assignment-configuration","Role Assignment Configuration",[18],"Here's the critical part: we want to grant Owner permissions and at the same time prevent the assignment of privileged roles. We'll achieve this by creating a role assignment with conditions that explicitly block assignments of the following roles: RoleRole Definition IDOwner8e3af657-a8ff-443c-a75c-2fe8c4bcb635User Access Administrator18d7d88d-d35e-4fb5-a5c3-7773c20a72d9Role Based Access Control Administratorf58310d9-a9f6-439a-9e8d-f62e7b41a168",{"id":195,"title":196,"titles":197,"content":198,"level":92},"/blog/conditional-role-assignments#condition-syntax","Condition Syntax",[18,191],"The condition checks whether the action the user performs is allowed. In this case, we prevent the user from creating (write) or deleting role assignments with the Role Definition IDs of the above roles. All other roles are allowed to be assigned. You can play around and add other roles as well, or simply turn it around to only allow roles you define. I use the triple quote to define a multi-line string ('''): condition: '''\n      ((!(ActionMatches{'Microsoft.Authorization/roleAssignments/write'})) OR\n      (@Request[Microsoft.Authorization/roleAssignments:RoleDefinitionId] ForAnyOfAllValues:GuidNotEquals {8e3af657-a8ff-443c-a75c-2fe8c4bcb635, 18d7d88d-d35e-4fb5-a5c3-7773c20a72d9, f58310d9-a9f6-439a-9e8d-f62e7b41a168})) AND\n      ((!(ActionMatches{'Microsoft.Authorization/roleAssignments/delete'})) OR\n      (@Resource[Microsoft.Authorization/roleAssignments:RoleDefinitionId] ForAnyOfAllValues:GuidNotEquals {8e3af657-a8ff-443c-a75c-2fe8c4bcb635, 18d7d88d-d35e-4fb5-a5c3-7773c20a72d9, f58310d9-a9f6-439a-9e8d-f62e7b41a168}))\n      ''' This condition ensures that even with Owner permissions, the user cannot grant or delete these privileged roles to/from others.",{"id":200,"title":201,"titles":202,"content":203,"level":92},"/blog/conditional-role-assignments#bicep-implementation","Bicep Implementation",[18,191],"Here's how we implement this in Bicep: param principalId string = '00000000-0000-0000-0000-000000000000' // Replace with the actual principal ID\n\nresource roleAssignment 'Microsoft.Authorization/roleAssignments@2022-04-01' = {\n  name: guid(subscription().id, principalId, 'owner-no-privesc')\n  properties: {\n    principalId: principalId\n    roleDefinitionId: '/subscriptions/${subscription().subscriptionId}/providers/Microsoft.Authorization/roleDefinitions/8e3af657-a8ff-443c-a75c-2fe8c4bcb635' // Owner\n    condition: '''\n      ((!(ActionMatches{'Microsoft.Authorization/roleAssignments/write'})) OR\n      (@Request[Microsoft.Authorization/roleAssignments:RoleDefinitionId] ForAnyOfAllValues:GuidNotEquals {8e3af657-a8ff-443c-a75c-2fe8c4bcb635, 18d7d88d-d35e-4fb5-a5c3-7773c20a72d9, f58310d9-a9f6-439a-9e8d-f62e7b41a168})) AND\n      ((!(ActionMatches{'Microsoft.Authorization/roleAssignments/delete'})) OR\n      (@Resource[Microsoft.Authorization/roleAssignments:RoleDefinitionId] ForAnyOfAllValues:GuidNotEquals {8e3af657-a8ff-443c-a75c-2fe8c4bcb635, 18d7d88d-d35e-4fb5-a5c3-7773c20a72d9, f58310d9-a9f6-439a-9e8d-f62e7b41a168}))\n      '''\n    conditionVersion: '2.0'\n  }\n}",{"id":205,"title":206,"titles":207,"content":208,"level":56},"/blog/conditional-role-assignments#permissions-at-scale","Permissions at Scale",[18],"When managing many subscriptions, resource groups, or resources, manual RBAC assignments become unmanageable. You need a repeatable, auditable, and secure way to grant and manage access ideally with the ability to: Assign roles to users, groups, or managed identitiesApply conditions for least privilegeTrack and review assignments over time",{"id":210,"title":211,"titles":212,"content":213,"level":56},"/blog/conditional-role-assignments#enter-azure-verified-modules-avm","Enter Azure Verified Modules (AVM)",[18],"The AVM Role Assignment module lets you declaratively manage role assignments at any scope. Let's look at a practical example for assigning the Owner role at the subscription level, with a condition to prevent privilege escalation. The Bicep will look almost the same: targetScope = 'managementGroup'\n\nparam principalId string = ''\n\nparam location string = 'swedencentral'\n\nparam subscriptionId string = '00000000-0000-0000-0000-000000000000' // Default subscription ID\n\nmodule roleAssignment 'br/public:avm/ptn/authorization/role-assignment:0.2.2' = {\n  name: 'roleAssignmentDeployment'\n  params: {\n    // Required parameters\n    principalId: principalId\n    roleDefinitionIdOrName: 'Reader'\n    // Non-required parameters\n    description: 'Role Assignment (subscription scope)'\n    location: location\n    subscriptionId: subscriptionId\n  }\n}",{"id":215,"title":216,"titles":217,"content":218,"level":56},"/blog/conditional-role-assignments#using-the-module-for-scaled-operations","Using the module for scaled operations",[18],"With the module, you can easily assign roles with conditions to multiple principals or scopes using a for loop in Bicep. This approach reduces duplication and minimizes the risk of errors. For example, to assign the Owner role with the privilege escalation prevention condition to several user or group IDs you can pass an array of principals to process. The array will reside in a separate .bicepparam file. targetScope = 'managementGroup'\n\nparam ownerPrincipals array = []\n\nmodule OwnerRoleAssignments 'br/public:avm/ptn/authorization/role-assignment:0.2.2' = [\n  for principal in ownerPrincipals: {\n    name: guid(principal.id, 'owner-no-privesc')\n    params: {\n      principalId: principal.id\n      roleDefinitionIdOrName: 'Owner'\n      condition: principal.condition ?? ''\n      conditionVersion: '2.0'\n      subscriptionId: principal.subscriptionId\n    }\n  }\n] And the .bicepparam file will have the configuration for each principal you want to assign the permissions to. Remember you can always change the other parameters to include them in the bicepparam, like the assigned role etc. The way you can maximize your scaled operations! using 'main.bicep'\n\nparam ownerPrincipals = [\n  {\n    id: '00000000-0000-0000-0000-000000000002'\n    subscriptionId: '00000000-0000-0000-0000-000000000002'\n    condition: '''\n            ((!(ActionMatches{'Microsoft.Authorization/roleAssignments/write'})) OR\n            (@Request[Microsoft.Authorization/roleAssignments:RoleDefinitionId] ForAnyOfAllValues:GuidNotEquals {8e3af657-a8ff-443c-a75c-2fe8c4bcb635, 18d7d88d-d35e-4fb5-a5c3-7773c20a72d9, f58310d9-a9f6-439a-9e8d-f62e7b41a168})) AND\n            ((!(ActionMatches{'Microsoft.Authorization/roleAssignments/delete'})) OR\n            (@Resource[Microsoft.Authorization/roleAssignments:RoleDefinitionId] ForAnyOfAllValues:GuidNotEquals {8e3af657-a8ff-443c-a75c-2fe8c4bcb635, 18d7d88d-d35e-4fb5-a5c3-7773c20a72d9, f58310d9-a9f6-439a-9e8d-f62e7b41a168}))\n        '''\n  }\n] This pattern ensures each principal receives the correct assignment, and you only need to update the ownerPrincipals array to manage access at scale.",{"id":220,"title":221,"titles":222,"content":223,"level":56},"/blog/conditional-role-assignments#testing-and-verification","Testing and verification",[18],"I deployed the AVM Bicep module with: az deployment mg create -m 'MyManagementGroupName' --location westeurope --parameters .\\parameters.bicepparam Hint: you should use a pipeline for that, but that's not part of this blog. In the portal, I went to my subscription and checked the role assignments: After clicking View/Edit I saw there was a configuration: Then I checked the conditions:",{"id":225,"title":226,"titles":227,"content":228,"level":56},"/blog/conditional-role-assignments#benefits-and-considerations","Benefits and Considerations",[18],"This approach offers several advantages: Operational Efficiency: Teams are empowered to manage their own resources independently, reducing the need to request additional permissions from administrators.Enhanced Security: Sensitive role assignments are protected, minimizing the risk of privilege escalation and unauthorized access.Simplified Management: There's no longer a need to create and maintain complex custom roles, streamlining access control.Scalable Solution: This approach can be easily implemented across many subscriptions, making it suitable for organizations of any size. However, keep in mind that conditions add complexity to role assignments. Testing is crucial to ensure conditions work as expected. And always keep monitoring and auditing!",{"id":230,"title":231,"titles":232,"content":233,"level":56},"/blog/conditional-role-assignments#compliance","Compliance",[18],"So next time you need to report which privileged roles are assigned, you could simply use your codebase as proof! This multi-layered approach ensures both security and operational efficiency. As always, leave a comment on LinkedIn if you have any questions. Happy coding! ☕ html pre.shiki code .slsVL, html code.shiki .slsVL{--shiki-light:#24292E;--shiki-default:#E1E4E8;--shiki-dark:#E1E4E8}html pre.shiki code .sfrk1, html code.shiki .sfrk1{--shiki-light:#032F62;--shiki-default:#9ECBFF;--shiki-dark:#9ECBFF}html .light .shiki span {color: var(--shiki-light);background: var(--shiki-light-bg);font-style: var(--shiki-light-font-style);font-weight: var(--shiki-light-font-weight);text-decoration: var(--shiki-light-text-decoration);}html.light .shiki span {color: var(--shiki-light);background: var(--shiki-light-bg);font-style: var(--shiki-light-font-style);font-weight: var(--shiki-light-font-weight);text-decoration: var(--shiki-light-text-decoration);}html .default .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}html.dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}html pre.shiki code .so5gQ, html code.shiki .so5gQ{--shiki-light:#D73A49;--shiki-default:#F97583;--shiki-dark:#F97583}html pre.shiki code .sCsY4, html code.shiki .sCsY4{--shiki-light:#6A737D;--shiki-default:#6A737D;--shiki-dark:#6A737D}html pre.shiki code .shcOC, html code.shiki .shcOC{--shiki-light:#6F42C1;--shiki-default:#B392F0;--shiki-dark:#B392F0}html pre.shiki code .suiK_, html code.shiki .suiK_{--shiki-light:#005CC5;--shiki-default:#79B8FF;--shiki-dark:#79B8FF}",{"id":23,"title":22,"titles":235,"content":236,"level":50},[],"Automate banned password list management across multiple Entra ID tenants using Microsoft Graph API, PowerShell and Azure DevOps. In today's complex cloud environments, managing password security across multiple tenants is a critical challenge for IT administrators. Microsoft Entra ID provides powerful mechanisms to implement centralized password policies, but effective implementation requires careful planning and robust automation.",{"id":238,"title":239,"titles":240,"content":241,"level":56},"/blog/entraid-banned-password-list#graph-api","Graph API",[22],"Microsoft Graph API revolutionizes Entra ID tenant management by providing a powerful automation framework that simplifies complex multi-tenant configurations. With this API, you can effortlessly streamline identity and security management through comprehensive bulk operations, including: Automated user creationEfficient group managementPrivileged Identity Management (PIM) role assignmentsGranular password policy enforcement The API's finely tuned permission model lets you stay in control with precision, ensuring every automation script aligns with the principle of least privilege. This not only keeps your operations smooth but also fortifies your identity infrastructure's security. By embracing Microsoft Graph API, you can shift from time-consuming manual processes to efficient, repeatable workflows that boost consistency and cut down on administrative overhead. Sounds like a win, right? Let's explore how to make it happen.",{"id":243,"title":59,"titles":244,"content":245,"level":56},"/blog/entraid-banned-password-list#the-requirements",[22],"Before diving into the implementation, ensure you have: An IDE (such as Visual Studio Code)Entra ID Premium licenseAzure DevOps access with sufficient permissionsService connection(s) that have the Authentication Policy Administrator permissionProficiency in PowerShell\nAzure PowerShell moduleMicrosoft Graph module With these prerequisites ready, you're all set to start building automated solutions that simplify your tenant management.",{"id":247,"title":248,"titles":249,"content":250,"level":56},"/blog/entraid-banned-password-list#scope","Scope",[22],"Today we will focus on the following subjects: Centralizing banned password managementSupporting multi-tenant password restrictionsAutomating policy deployment through Azure DevOps To prevent the blog from becoming too long, a few topics are out-of-scope: Multi-tenant deployment strategySetting up Service Connections in Azure DevOpsCreating branch policies in Azure DevOpsVarious testing/error handling",{"id":252,"title":64,"titles":253,"content":254,"level":56},"/blog/entraid-banned-password-list#configuration-steps",[22],"Before we dive into the code, I will summarize the order of what we will do: Create/prepare the required payload (body) files for the APIDraft a PowerShell script to update the settings in Entra IDCreate a YAML-pipeline for Azure DevOps that runs automatically In the end you will have a basic automated way to update Banned Password Lists in multiple Entra ID tenants!",{"id":256,"title":257,"titles":258,"content":259,"level":56},"/blog/entraid-banned-password-list#code-implementation","Code Implementation",[22],"The code for this blog is hosted on my public Github repository.",{"id":261,"title":262,"titles":263,"content":264,"level":92},"/blog/entraid-banned-password-list#folder-structure","Folder structure",[22,257],"To be able to create multi-tenant deployments, we are going to parameterize the banned password list settings per tenant. Therefore, we need a consistent folder structure to support this type of deployment. bannedPasswords\n├── code\n│   ├── Set-PasswordSettings.ps1\n├── parameters\n│   ├── passwordSettings.json\n│   ├── bannedPasswords-tenantA.json\n│   └── bannedPasswords-tenantB.json\n├── pipelines\n│   ├── set-password-settings.yaml",{"id":266,"title":267,"titles":268,"content":269,"level":92},"/blog/entraid-banned-password-list#uri-background","Uri background",[22,257],"To set the banned password list, we need to update the Entra ID setting 'Password Rule Settings'. This one is currently in beta only, so as always be aware things may change in the future. See the docs for more information. The URI is: URI https://graph.microsoft.com/beta/settings Getting the settings returns the following setting types: id          : xxxxxxxx-d947-4d19-a028-xxxxxxxxxxxx\ndisplayName : Group.Unified\ntemplateId  : 62375ab9-6b52-47ed-826b-58e47e0e304b\nvalues      : {…}\n\nid          : xxxxxxxx-7701-4e25-8c81-xxxxxxxxxxxx\ndisplayName : Password Rule Settings\ntemplateId  : 5cf42378-d67d-4f36-ba46-e8b86229381d\nvalues      : {…} The API URI targets 'settings', which manages multiple Entra ID settings. Therefore, we target 'Password Rule Settings' to validate we use the correct template ID to use in our URI. This will become clear in the PowerShell code snippet below.",{"id":271,"title":272,"titles":273,"content":274,"level":92},"/blog/entraid-banned-password-list#createprepare-the-required-payload-body-files-for-the-api","Create/prepare the required payload (body) files for the API",[22,257],"The Password Rule Settings expects a JSON-file body (passwordSettings.json) that contains the settings to update. The file has the following structure: {\n  \"templateId\": \"5cf42378-d67d-4f36-ba46-e8b86229381d\",\n  \"values\": [\n    {\n      \"name\": \"BannedPasswordCheckOnPremisesMode\",\n      \"value\": \"Enforce\"\n    },\n    {\n      \"name\": \"EnableBannedPasswordCheckOnPremises\",\n      \"value\": \"True\"\n    },\n    {\n      \"name\": \"EnableBannedPasswordCheck\",\n      \"value\": \"True\"\n    },\n    {\n      \"name\": \"LockoutDurationInSeconds\",\n      \"value\": \"60\"\n    },\n    {\n      \"name\": \"LockoutThreshold\",\n      \"value\": \"10\"\n    },\n    {\n      \"name\": \"BannedPasswordList\",\n      \"value\": \"placeholder\"\n    }\n  ]\n} You may be thinking: why don't I add all the banned passwords in this file? A valid point, however since we want to support multiple tenants with the ability to differentiate, I chose to use separate files containing the banned passwords for each tenant.",{"id":276,"title":277,"titles":278,"content":279,"level":103},"/blog/entraid-banned-password-list#banned-passwords","Banned passwords",[22,257,272],"Each tenant will have a JSON-file containing the list of banned passwords. The file will be merged with the main parameter file and the merged file will be added as body to the request. bannedPasswords-tenantA.json: [\n  \"secret\",\n  \"123456\",\n  \"password\",\n  \"qwerty123\",\n  \"qwerty1\",\n  \"123456789\",\n  \"password1\",\n  \"12345678\",\n  \"12345\",\n  \"abc123\",\n  \"qwerty\",\n  \"iloveyou\",\n  \"Password\",\n  \"baseball\",\n  \"1234567\",\n  \"111111\",\n  \"princess\",\n  \"football\",\n  \"monkey\",\n  \"sunshine\"\n]",{"id":281,"title":282,"titles":283,"content":284,"level":103},"/blog/entraid-banned-password-list#quirky-body","Quirky body",[22,257,272],"What do I mean with 'quirky' body? The fact that the property BannedPasswordList expects tab-separated values instead of comma-separated values. This will need to be taken into account in the PowerShell script.",{"id":286,"title":287,"titles":288,"content":289,"level":92},"/blog/entraid-banned-password-list#powershell-script","PowerShell script",[22,257],"I will break down the PowerShell script in the following steps: Create a function to execute API operations on the 'settings' endpoint using an access tokenCombine the banned password list with the parameter fileUpdate the settings in Entra ID Let's initialize the script: [CmdLetBinding()]\nParam (\n    [Parameter(Mandatory,\n        HelpMessage = \"Enter the path of the parameter folder of authentication methods setting.\")]\n    [String]$ParameterFolderPath,\n    [Parameter(Mandatory,\n        HelpMessage = \"Enter the file path of the banned password list.\")]\n    [String]$TenantBannedPasswordsFilePath\n) Now we declare the function to update the Entra ID settings via the beta endpoint: function Set-EntraIdSetting {\n    param (\n        [Parameter(Mandatory,\n            HelpMessage = \"Provide the name of the settings to create/update.\")]\n        [Object]$TargetSettingName,\n        [Parameter(Mandatory,\n            HelpMessage = \"Provide the file path of the settings to create/update.\")]\n        [Object]$SettingFilePath\n    )\n    # Get the access token for the Microsoft Graph API\n    $settingsUri = \"https://graph.microsoft.com/beta/settings\"\n\n    Write-Output \"##[command]Get access token for the Microsoft Graph API\"\n    $accessToken = (Get-AzAccessToken -ResourceTypeName MSGraph -AsSecureString).Token\n    # set the params needed for the REST API requests\n    $params = @{\n        Method         = 'Get'\n        Uri            = $settingsUri\n        Authentication = 'Bearer'\n        Token          = $accessToken\n        ContentType    = 'application/json'\n    }\n    # Wrap the request in a try catch to ensure stopping errors\n    try {\n        $request = (Invoke-RestMethod @params).value\n    }\n    catch {\n        Throw $_\n    }\n    # Check if the request variable has a value\n    if ($request) {\n        Write-Output \"##[command]Found settings. Checking for setting '$TargetSettingName'\"\n        $targetSettingObject = $request | Where-Object { $_.displayName -eq $TargetSettingName }\n    }\n    # Continue checking if we have targeted the correct settings, and update the params accordingly\n    if ($targetSettingObject) {\n        Write-Output \"##[command]Found existing $TargetSettingName. Updating setting according to provided config.\"\n        $passwordSettingsUri = $settingsUri + '/' + $targetSettingObject.id\n        $params.Uri = $passwordSettingsUri\n        $params.Method = 'Patch'\n        $body = Get-Content -Path $SettingFilePath | ConvertFrom-Json -Depth 10\n        $body.PSObject.properties.remove('templateId')\n        $jsonBody = $body | ConvertTo-Json -Depth 10\n        try {\n            $settingRequest = Invoke-RestMethod @params -Body $jsonBody\n        }\n        catch {\n            throw $_\n        }\n    }\n    # Check if the setting does not exist. If this is the case we just post the entire template.\n    elseif (!$targetSettingObject) {\n        Write-Output \"##[command]No existing '$TargetSettingName'. Creating new '$TargetSettingName' according to provided config.\"\n        $jsonBody = Get-Content -Path $SettingFilePath\n\n        $params.Method = 'Post'\n\n        try {\n            $settingRequest = Invoke-RestMethod @params -Body $jsonBody\n        }\n        catch {\n            throw $_\n        }\n    }\n    return $settingRequest\n} Now we update the banned password list values of the parameter file: Write-Output \"##[command]Updating banned password list\"\n$bannedPasswords = Get-Content -Path $TenantBannedPasswordsFilePath | ConvertFrom-Json\n$bannedPasswordsList = $null\n$tab = [char]9\n\nWrite-Output \"##[command]Looping the banned password list and adding tabs needed for the REST API call.\"\nforeach ($bannedPassword in $bannedPasswords) {\n    $bannedPasswordsList += $bannedPassword + $tab\n}\n\nWrite-Output \"##[command]Trimming the banned password list to exclude the last tab.\"\n$trimmedPasswordList = $bannedPasswordsList -replace \".{1}$\"\n$bannedPasswordsSetting = Get-Content -Path \"$ParameterFolderPath\\passwordSettings.json\" | ConvertFrom-Json -Depth 5 -AsHashtable\n    ($bannedPasswordsSetting.values | Where-Object { $_.name -eq 'BannedPasswordList' }).value = $trimmedPasswordList\n$bannedPasswordsSetting | ConvertTo-Json -Depth 5 | Out-File \"$ParameterFolderPath\\updatedPasswordSettings.json\" After this we have successfully created a new JSON-file containing the entire configuration (parameters). This file will be used as body in the next step, where we present it to our previously written function: try {\n    Set-EntraIdSetting -TargetSettingName 'Password Rule Settings' -SettingFilePath \"$ParameterFolderPath\\updatedPasswordSettings.json\"\n    Write-Output \"Settings updated successfully!\"\n}\ncatch {\n    throw\n}",{"id":291,"title":292,"titles":293,"content":294,"level":92},"/blog/entraid-banned-password-list#azure-devops-yaml-pipeline","Azure DevOps YAML Pipeline",[22,257],"The pipeline is where the magic sauce comes into play. Make sure you have service connections setup to the various tenants you wish to manage. For each tenant you can add a separate stage, as per the example below. Because this is a small configuration change, the pipeline is not that complicated to set up. First let's add the correct trigger: trigger:\n  branches:\n    include:\n      - main\n  paths:\n    include:\n      - bannedPasswords/parameters Triggering on the branch and path makes sure the CI/CD runs as expected. You can add more checks before the first deployment stage, or better: add a test tenant stage to validate your configuration before pushing to production. Next we configure the pool, variables and stages of the pipeline. That's all it needs to deploy on every commit to main in the designated feature parameter folder. pool:\n  vmImage: ubuntu-latest\n\nvariables:\n  - name: ParameterFolderPath\n    value: bannedPasswords/parameters\n\nstages:\n  - stage: TenantA\n    jobs:\n      - job: TenantA\n        displayName: Updating Password Settings in Tenant A\n        steps:\n          - task: AzurePowerShell@5\n            displayName: Setting the configuration\n            inputs:\n              azureSubscription: \"TenantA-AuthenticationMethods-SPN\"\n              ScriptType: \"FilePath\"\n              ScriptPath: \"$(System.DefaultWorkingDirectory)/bannedPasswords/code/Set-PasswordSettings.ps1\"\n              ScriptArguments:\n                -ParameterFolderPath \"$(System.DefaultWorkingDirectory)/$(ParameterFolderPath)\"\n                -TenantBannedPasswordsFilePath \"$(System.DefaultWorkingDirectory)/$(ParameterFolderPath)/bannedPasswords-TenantA.json\"\n              azurePowerShellVersion: LatestVersion Adding another tenant is as easy as copy-pasting the previous stage and changing the parameters: - stage: TenantB\n  jobs:\n    - job: TenantB\n      displayName: Updating Password Settings in Tenant B\n      steps:\n        - task: AzurePowerShell@5\n          displayName: Setting the configuration\n          inputs:\n            azureSubscription: \"TenantB-AuthenticationMethods-SPN\"\n            ScriptType: \"FilePath\"\n            ScriptPath: \"$(System.DefaultWorkingDirectory)/bannedPasswords/code/Set-PasswordSettings.ps1\"\n            ScriptArguments:\n              -ParameterFolderPath \"$(System.DefaultWorkingDirectory)/$(ParameterFolderPath)\"\n              -TenantBannedPasswordsFilePath \"$(System.DefaultWorkingDirectory)/$(ParameterFolderPath)/bannedPasswords-TenantB.json\"\n            azurePowerShellVersion: LatestVersion",{"id":296,"title":297,"titles":298,"content":299,"level":92},"/blog/entraid-banned-password-list#running-the-pipeline","Running the pipeline",[22,257],"Let's see what my pipeline does when I run it… Success!",{"id":301,"title":302,"titles":303,"content":304,"level":56},"/blog/entraid-banned-password-list#potential-pitfalls-and-best-practices","Potential Pitfalls and Best Practices",[22],"This setup makes you versatile in configuring banned passwords for your environment(s). As always, stay aware of the pitfalls: Ensure password policies align with specific tenant compliance requirementsAlways implement a 4-eyes principle approval workflow in your automationTest thoroughly in staged environmentsRegularly review banned password lists and update accordinglyImplement comprehensive loggingUse the principle of least privilege for your automation accounts",{"id":306,"title":116,"titles":307,"content":308,"level":56},"/blog/entraid-banned-password-list#conclusion",[22],"That concludes this blog about configuring banned password lists via the Graph API with PowerShell. We have successfully: ✅ Created a folder structure for our files✅ Wrote an intermediate PowerShell script, including a function, to configure the Entra ID settings✅ Added a YAML-pipeline to automatically deploy the code to Entra ID Hopefully this provides you with a jump start into managing Entra ID(s) in an automated fashion. You can easily expand on this automation by adding new configuration of Entra ID like Conditional Access Policies or even PIM Eligible Role Assignments! As always, leave a comment on LinkedIn if you have any more questions. Happy coding! ☕ html pre.shiki code .slsVL, html code.shiki .slsVL{--shiki-light:#24292E;--shiki-default:#E1E4E8;--shiki-dark:#E1E4E8}html pre.shiki code .suiK_, html code.shiki .suiK_{--shiki-light:#005CC5;--shiki-default:#79B8FF;--shiki-dark:#79B8FF}html pre.shiki code .sfrk1, html code.shiki .sfrk1{--shiki-light:#032F62;--shiki-default:#9ECBFF;--shiki-dark:#9ECBFF}html .light .shiki span {color: var(--shiki-light);background: var(--shiki-light-bg);font-style: var(--shiki-light-font-style);font-weight: var(--shiki-light-font-weight);text-decoration: var(--shiki-light-text-decoration);}html.light .shiki span {color: var(--shiki-light);background: var(--shiki-light-bg);font-style: var(--shiki-light-font-style);font-weight: var(--shiki-light-font-weight);text-decoration: var(--shiki-light-text-decoration);}html .default .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}html.dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}html pre.shiki code .so5gQ, html code.shiki .so5gQ{--shiki-light:#D73A49;--shiki-default:#F97583;--shiki-dark:#F97583}html pre.shiki code .sQHwn, html code.shiki .sQHwn{--shiki-light:#E36209;--shiki-default:#FFAB70;--shiki-dark:#FFAB70}html pre.shiki code .shcOC, html code.shiki .shcOC{--shiki-light:#6F42C1;--shiki-default:#B392F0;--shiki-dark:#B392F0}html pre.shiki code .sCsY4, html code.shiki .sCsY4{--shiki-light:#6A737D;--shiki-default:#6A737D;--shiki-dark:#6A737D}html pre.shiki code .sByVh, html code.shiki .sByVh{--shiki-light:#22863A;--shiki-default:#85E89D;--shiki-dark:#85E89D}",{"id":27,"title":26,"titles":310,"content":311,"level":50},[],"How switching from Hugo to Nuxt opened the door to vibe coding with GenAI and why a mature framework makes all the difference when you want to build, explore, and experiment fast. Last year I was running this blog on Hugo. It was fine. Hugo is fast, reliable, and battle-tested. I have nothing bad to say about it. But over time, I kept running into a wall. I wanted to be able to spice things up a bit with theming, animations/transitions and other features I didn't know I wanted (looking at you RSS feed). I found myself fighting the framework rather than building with it because of this. Over the last year I tried the vibe code my new blog on various occasions. The last time was October 2025. Although it brought me further, I still was correcting a lot of output. But then Opus 4.6 (and now Sonnet 4.6) hit the market. What a difference, everything changed. I wanted to try it again. With this website as a result.",{"id":313,"title":314,"titles":315,"content":316,"level":56},"/blog/from-hugo-to-nuxt-vibe-coding#what-vibe-coding-actually-means-to-me","What \"Vibe Coding\" Actually Means to Me",[26],"I want to be clear about what I mean by vibe coding, because it gets thrown around a lot. For me, it is not about blindly pasting AI output and hoping for the best. It is about having a fast, creative back-and-forth with a model where I describe what I want, in plain language or by pointing at code, and the model helps me realize it. I stay in control. I understand what lands in the codebase. But the friction between \"idea\" and \"working thing\" drops dramatically. For that to work well, I wanted a framework that has a lot of online presence and the model knows deeply. Hugo is a niche static site generator. It has its own templating language, its own directory conventions, its own quirks. When I asked a model to help me extend something in Hugo, I spent a lot of time correcting misunderstandings. The model knew Go templates and Hugo's data pipeline at a surface level, at best. Vue and Nuxt? The models know those inside out. Every pattern, every composable, every Tailwind class. The conversation just flows a lot better.",{"id":318,"title":319,"titles":320,"content":321,"level":56},"/blog/from-hugo-to-nuxt-vibe-coding#why-nuxt-specifically","Why Nuxt Specifically",[26],"I considered a few options. Next.js was an obvious candidate since React is everywhere and models are very strong with it. But I have always preferred Vue's approach to component design. The single-file component format, the reactivity model, the way templates stay readable. It suits how I think. Nuxt builds on Vue and fills in everything you need for a real content site: file-based routing, server routes, auto-imports, a content layer built around Markdown. It is not a toy framework. Companies ship production applications with it. That maturity matters, because it means the patterns I learn and the things I build are not throwaway experiments. They are transferable. The Nuxt Content module in particular was the deciding factor. My posts are Markdown files, and they always will be. Nuxt Content treats them as a first-class data source. I can query posts, filter by tag, sort by date, and render MDC components inside Markdown, all without reaching for a CMS or a third-party API.",{"id":323,"title":324,"titles":325,"content":326,"level":56},"/blog/from-hugo-to-nuxt-vibe-coding#the-migration","The Migration",[26],"Migrating the actual content was straightforward. Hugo and Nuxt both expect Markdown with YAML frontmatter, so my posts moved over without changes beyond a few field name adjustments. The real work was building the site itself: the layout, navigation, search, tag pages, and RSS feed. And this is exactly where vibe coding paid off. In Hugo it would cost me a lot more time. In Nuxt, I described what I wanted, iterated in short loops with AI assistance, and had something I was proud of within a weekend. Not every suggestion landed perfectly. There were moments where I needed to read the Nuxt docs or dig into how a composable actually worked. But that is a healthy part of the process. I understand this codebase. I just built it faster than I ever could have on my own. It's true what they say: understanding a language is easier than speaking it. You could say the same about programming languages. I understand variables, arrays, loops, if/else statements. But in every languague you have to get to know the syntax properly before you can start flying. With my Copilot, I found this part to be particularly fast-tracked.",{"id":328,"title":329,"titles":330,"content":331,"level":56},"/blog/from-hugo-to-nuxt-vibe-coding#what-changes-when-you-use-a-mature-framework","What Changes When You Use a Mature Framework",[26],"There is an underappreciated advantage to using a framework with a large ecosystem: the guard rails are already built. Nuxt handles code splitting, hydration, SEO meta, image optimization, and TypeScript out of the box or with a single module install. I do not have to invent solutions to problems that have already been solved a hundred times. This matters even more when working with GenAI. When I ask for help with something in Nuxt, the model can suggest an idiomatic solution, one that fits the framework's conventions. In a niche tool, the model improvises. In Nuxt, it suggests useAsyncData, definePageMeta, a server/routes/ file. Things that actually exist and work the way they are supposed to. The result is that my blog is now more capable than it ever was on Hugo. It has live search across all post content, tag filtering, a proper RSS feed, dark and light mode, and responsive design. The code is clean enough that I can keep extending it with confidence. Now the only thing that is missing is... Content.",{"id":333,"title":334,"titles":335,"content":336,"level":56},"/blog/from-hugo-to-nuxt-vibe-coding#exploring-genai-as-a-daily-tool","Exploring GenAI as a Daily Tool",[26],"I want to be honest: I am a Cloud Architect by trade, not a frontend developer. JavaScript frameworks are not my primary home. What surprised me most about this project is how much I learned by doing it this way. When the model explained why a particular reactive pattern works in Vue, or suggested a server route instead of a client-side fetch, I paid attention. I looked things up. I built a working mental model. GenAI is at its best when it accelerates genuine learning rather than bypassing it. If I had just accepted every code block without reading it, I would have a site I could not maintain. Instead I have a site I understand well enough to keep improving, and a framework I am now genuinely comfortable with. That feels like the right way to use these tools. My approach was simple: Start anew with an empty Git repo.Don't let AI build you scaffold: build it yourself following official documentation.When I had the starter website working and running, I commit this code. This is now my baseline.From here on out I started iterating:\nFirst I set the theme colors. Is it to my liking? Commit!Then I started working on the various pages. Commit!Menu bar. Commit!etc.",{"id":338,"title":339,"titles":340,"content":341,"level":56},"/blog/from-hugo-to-nuxt-vibe-coding#whats-next","What's Next",[26],"Now that the foundation is solid, I want to keep pushing on what a personal tech blog can be. A few things I am thinking about: Reading progress indicator on long postsRelated posts suggestions based on tag overlapNewsletter signup without a third-party service, handled by a Nuxt server routeAutomated post metadata, meaning generating descriptions and reading time during build All of these are things I would not have touched on Hugo (although provided out of the box) In Nuxt, with good tooling and GenAI on my side, they feel totally within control. If you are sitting on a static site generator that is starting to feel limiting, I would encourage you to take a serious look at Nuxt. The migration effort is real but manageable, and what you get on the other side is a full-stack web framework backed by a huge ecosystem, paired with the most capable AI coding tools we have ever had. That is genuinely exciting. Happy to answer questions. Find me on LinkedIn.",{"id":31,"title":30,"titles":343,"content":344,"level":50},[],"A complete guide to setting up Ubuntu 24.04 LTS with Intune, including the Intune Portal, Microsoft Edge, development tools, and more. In this guide, I'll walk you through setting up Ubuntu 24.04 LTS with Intune. As a Cloud Architect at Rubicon B.V., I've been testing whether Ubuntu provides the Edge (pun intended) I need to fulfill my work activities. Specifically, I'll cover how to install the Intune Portal as well as the software I used for my Ubuntu 24.04 installation. You will find instructions for every installation below. I hope this helps you if you have any issues enrolling Ubuntu 24.04 with Intune.",{"id":346,"title":347,"titles":348,"content":349,"level":56},"/blog/intune-ubuntu-24-04#steps","Steps",[30],"Here's a list of things that we're going through in this post: #SoftwarePurposeInstallation1.Microsoft EdgeCompany device managementapt2.Intune PortalCompany device managementapt3.Microsoft 365, including Teams and OutlookOffice activitiesPWA4.Draw.ioCreating designsSnap5.VS CodeDevelopmentSnap6.PowerShellDevelopmentSnap7.KeepassXCPassword managementSnap8.Azure CLIDevelopmentapt9.BicepDevelopmentbinary10.DisplayLinkMulti-monitor supportapt",{"id":351,"title":352,"titles":353,"content":354,"level":56},"/blog/intune-ubuntu-24-04#software-on-beta-branch-of-ubuntu-2404-lts","Software on Beta branch of Ubuntu 24.04 LTS",[30],"When I started going down this road, Ubuntu 24.04 was still in beta. Installing software on a pre-release version of Ubuntu can be challenging. Typically, I prefer to keep packages as close to the source as possible. This means either installing from the official repository using apt, or adding the developer's repository and then installing with apt. Initially, I hesitated about using Snap packages due to concerns about their larger size and potential performance impact compared to APT packages. However, when dealing with a beta version of Ubuntu 24.04 LTS, options become limited. Lack of up-to-date documentation and repositories often leads to tinkering with apt sources and keyrings. This process involves navigating dependencies and version pinning, which can be error-prone. By opting for Snap, I streamlined the installation process, making it more straightforward and reliable. Update 25-05-2024: Ubuntu 24.04 LTS was officially released. Still, taking into account software release cycles it is expected many applications have not yet found their way into the 24.04 repositories.",{"id":356,"title":357,"titles":358,"content":359,"level":56},"/blog/intune-ubuntu-24-04#manual-installation-of-the-intune-portal","Manual installation of the Intune Portal",[30],"The Intune portal is provided (and officially supported) for Ubuntu 22.04. By adding backport repositories it is possible to install it on 24.04 without compatibility issues. Follow the steps below to install the Intune Portal application. Edit /etc/apt/sources.list.d/ubuntu.sources and: Make sure you have both noble sources and mantic sourcesAdd an entry for mantic-security as well Types: deb\nURIs: http://nl.archive.ubuntu.com/ubuntu/\nSuites: mantic\nComponents: main restricted universe multiverse\nSigned-By: /usr/share/keyrings/ubuntu-archive-keyring.gpg\n\nTypes: deb\nURIs: http://security.ubuntu.com/ubuntu/\nSuites: mantic-security\nComponents: main restricted universe multiverse\nSigned-By: /usr/share/keyrings/ubuntu-archive-keyring.gpg The file /etc/apt/sources.list.d/ubuntu.sources should look like the code block below: Types: deb\nURIs: http://archive.ubuntu.com/ubuntu\nSuites: noble noble-updates noble-backports\nComponents: main restricted universe multiverse\nSigned-By: /usr/share/keyrings/ubuntu-archive-keyring.gpg\n\nTypes: deb\nURIs: http://security.ubuntu.com/ubuntu/\nSuites: noble-security\nComponents: main restricted universe multiverse\nSigned-By: /usr/share/keyrings/ubuntu-archive-keyring.gpg\n\nTypes: deb\nURIs: http://nl.archive.ubuntu.com/ubuntu/\nSuites: mantic\nComponents: main restricted universe multiverse\nSigned-By: /usr/share/keyrings/ubuntu-archive-keyring.gpg\n\nTypes: deb\nURIs: http://security.ubuntu.com/ubuntu/\nSuites: mantic-security\nComponents: main restricted universe multiverse\nSigned-By: /usr/share/keyrings/ubuntu-archive-keyring.gpg This will ensure you have access to 22.04 (mantic) packages which we need during the next phase. Install Microsoft Edge for Business. Edge is needed for the Intune Portal as it leverages the built-in authentication mechanisms. curl https://packages.microsoft.com/keys/microsoft.asc | gpg --dearmor > microsoft.gpg\nsudo install -o root -g root -m 644 microsoft.gpg /etc/apt/trusted.gpg.d/\nsudo sh -c 'echo \"deb [arch=amd64] https://packages.microsoft.com/repos/edge stable main\" > /etc/apt/sources.list.d/microsoft-edge-dev.list'\nsudo rm microsoft.gpg\nsudo apt update && sudo apt install microsoft-edge-stable Install the prerequisites for the Intune Portal: sudo apt install openjdk-11-jre libicu72 libjavascriptcoregtk-4.0-18 libwebkit2gtk-4.0-37 Install intune-portal: curl https://packages.microsoft.com/keys/microsoft.asc | gpg --dearmor > microsoft.gpg\nsudo install -o root -g root -m 644 microsoft.gpg /usr/share/keyrings/\nsudo sh -c 'echo \"deb [arch=amd64 signed-by=/usr/share/keyrings/microsoft.gpg] https://packages.microsoft.com/ubuntu/22.04/prod jammy main\" > /etc/apt/sources.list.d/microsoft-ubuntu-jammy-prod.list'\nsudo rm microsoft.gpg\nsudo apt update\nsudo apt install intune-portal Sign in and smile! (note: It can take up to 1 hour for it to sync, please be patient)",{"id":361,"title":362,"titles":363,"content":364,"level":56},"/blog/intune-ubuntu-24-04#using-an-older-version-of-the-microsoft-identity-broker-package","Using an older version of the Microsoft Identity Broker package",[30],"I tested the latest microsoft-identity-broker package, and it now works with the Intune Portal. Please use the latest version where possible! If you need microsoft-identity-broker v.1.7.0 follow these steps: sudo apt purge microsoft-identity-broker\nsudo apt install microsoft-identity-broker=1.7.0\nsudo apt-mark hold microsoft-identity-broker If you use microsoft-identity-broker v.1.7.0 and want to go to the latest version, follow these steps: sudo apt-mark unhold microsoft-identity-broker\nsudo apt purge microsoft-identity-broker\nsudo apt install microsoft-identity-broker Purge intune-portal from apt and install it once again so it uses the latest Microsoft Identity Broker: sudo apt purge intune-portal\nsudo apt install intune-portal Sign in and smile! (note: It can take up to 1 hour for it to sync, please be patient)",{"id":366,"title":367,"titles":368,"content":369,"level":56},"/blog/intune-ubuntu-24-04#other-software","Other software",[30],"The other software I installed are mostly for me to be able to do my daily work activities. Your software suite may vary. To give you a complete picture I outlined the software in the next chapters.",{"id":371,"title":372,"titles":373,"content":374,"level":92},"/blog/intune-ubuntu-24-04#snap-packages","Snap packages",[30,367],"These snaps work like a charm: PowerShellVS CodeKeepassXCDraw.io sudo snap install powershell vscode keepassxc drawio",{"id":376,"title":377,"titles":378,"content":379,"level":56},"/blog/intune-ubuntu-24-04#apt-packages","apt packages",[30],"Apt packages that work without issues on Ubuntu 24.04 LTS: gitcurlgnome-tweaks sudo apt install git curl gnome-tweaks",{"id":381,"title":382,"titles":383,"content":384,"level":92},"/blog/intune-ubuntu-24-04#progressive-web-app-pwa","Progressive Web App (PWA)",[30,377],"To be able to leverage Microsoft's Office suite and Teams client you can install them as PWA on the system. I've installed: OutlookMicrosoft 365Teams (v2)OneNote Installation can be done via your specific browser. I used Edge and pinned the PWA's to my dock.",{"id":386,"title":387,"titles":388,"content":389,"level":56},"/blog/intune-ubuntu-24-04#azure-cli-bicep","Azure CLI & Bicep",[30],"Azure CLI has no official candidate for 24.04, but you can use 22.04 just fine (link): curl -sLS https://packages.microsoft.com/keys/microsoft.asc |\n  sudo gpg --dearmor -o /etc/apt/keyrings/microsoft.gpg\nsudo chmod go+r /etc/apt/keyrings/microsoft.gpg\nAZ_DIST='jammy'\necho \"Types: deb\nURIs: https://packages.microsoft.com/repos/azure-cli/\nSuites: ${AZ_DIST}\nComponents: main\nArchitectures: $(dpkg --print-architecture)\nSigned-by: /etc/apt/keyrings/microsoft.gpg\" | sudo tee /etc/apt/sources.list.d/azure-cli.sources\nsudo apt-get update\nsudo apt-get install azure-cli For Bicep get the latest binary (link): curl -Lo bicep https://github.com/Azure/bicep/releases/latest/download/bicep-linux-x64\nchmod +x ./bicep\nsudo mv ./bicep /usr/local/bin/bicep\nbicep --help",{"id":391,"title":392,"titles":393,"content":394,"level":56},"/blog/intune-ubuntu-24-04#displaylink","DisplayLink",[30],"To support multi-monitor setups you need DisplayLink software from Synapse. You can install DisplayLink with these commands: If you are using secure boot: Follow the steps on this page or see the GIF below. wget -P ~/Downloads https://www.synaptics.com/sites/default/files/Ubuntu/pool/stable/main/all/synaptics-repository-keyring.deb | sudo apt install ~/Downloads/synaptics-repository-keyring.deb\nsudo apt update\nsudo apt install displaylink-driver\nrm ~/Downloads/synaptics-repository-keyring.deb",{"id":396,"title":397,"titles":398,"content":399,"level":56},"/blog/intune-ubuntu-24-04#further-git-configuration","Further GIT configuration",[30],"To integrate git secrets with the gnome-keyring you have to compile the git-credential-libsecret: sudo apt-get install -y libsecret-tools\nsudo apt-get install -y gcc make libsecret-1-0 libsecret-1-dev\ncd /usr/share/doc/git/contrib/credential/libsecret\nsudo make\ngit config --global credential.helper /usr/share/doc/git/contrib/credential/libsecret/git-credential-libsecret\nsudo apt purge libsecret-1-dev -y && sudo apt autoremove -y After the configuration you execute git commands on your repo, fill in the password at the prompt and it will be saved to the Gnome Keyring.",{"id":401,"title":116,"titles":402,"content":403,"level":56},"/blog/intune-ubuntu-24-04#conclusion",[30],"And there you have it: my Ubuntu 24.04 installation, seamlessly integrated with Intune. After a week of working with this setup, I can confidently say it's both robust and lightning-fast! Even on an Intel i7 7700HQ, the performance is impressive, so if you're using newer hardware, expect an even smoother experience. Now, I'm curious—what's your experience been like with Ubuntu and Intune? html pre.shiki code .shcOC, html code.shiki .shcOC{--shiki-light:#6F42C1;--shiki-default:#B392F0;--shiki-dark:#B392F0}html pre.shiki code .sfrk1, html code.shiki .sfrk1{--shiki-light:#032F62;--shiki-default:#9ECBFF;--shiki-dark:#9ECBFF}html .light .shiki span {color: var(--shiki-light);background: var(--shiki-light-bg);font-style: var(--shiki-light-font-style);font-weight: var(--shiki-light-font-weight);text-decoration: var(--shiki-light-text-decoration);}html.light .shiki span {color: var(--shiki-light);background: var(--shiki-light-bg);font-style: var(--shiki-light-font-style);font-weight: var(--shiki-light-font-weight);text-decoration: var(--shiki-light-text-decoration);}html .default .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}html.dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}html pre.shiki code .so5gQ, html code.shiki .so5gQ{--shiki-light:#D73A49;--shiki-default:#F97583;--shiki-dark:#F97583}html pre.shiki code .suiK_, html code.shiki .suiK_{--shiki-light:#005CC5;--shiki-default:#79B8FF;--shiki-dark:#79B8FF}html pre.shiki code .slsVL, html code.shiki .slsVL{--shiki-light:#24292E;--shiki-default:#E1E4E8;--shiki-dark:#E1E4E8}",{"id":35,"title":34,"titles":405,"content":406,"level":50},[],"Combine PIM eligible roles with conditional role assignments to give teams just-in-time Owner access while preventing privilege escalation. Welcome back! If you haven't seen my deep dive on conditional role assignments with Bicep make sure to read that first. Because I left a major flaw in that example code. I assigned a permanently active 'Owner' role assignment. Of course, this is not a realistic scenario. To manage your Azure resources safely, we need to have Privileged Identity Management (PIM)! Let's iterate further on my previous blog and see how you can combine PIM with role assignment conditions to keep your landing zones secure.",{"id":408,"title":409,"titles":410,"content":411,"level":56},"/blog/pim-conditional-role-assignments#what-you-need","What you need",[34],"Azure AD P2 or Entra ID PremiumPIM enabledAzure PowerShell (Az module)Permission to create PIM role assignment schedule requestsFamiliarity with conditional role assignments (see my previous post!)",{"id":413,"title":414,"titles":415,"content":416,"level":56},"/blog/pim-conditional-role-assignments#the-scenario","The scenario",[34],"Let's say you want to make a user or group eligible for Owner at the subscription level, but you want to make sure that when they activate they can't assign Owner, User Access Administrator, or RBAC Admin to anyone else. We'll use a conditional role assignment to enforce this policy, just as in my previous post.",{"id":418,"title":419,"titles":420,"content":421,"level":56},"/blog/pim-conditional-role-assignments#step-1-write-the-condition","Step 1: Write the condition",[34],"The condition is almost identical to what we used for regular role assignments. We're blocking create (write) and delete actions for the three privileged roles. All other roles are allowed. Here's the condition: $condition = @\"\n((!(ActionMatches{'Microsoft.Authorization/roleAssignments/write'})) OR\n(@Request[Microsoft.Authorization/roleAssignments:RoleDefinitionId] ForAnyOfAllValues:GuidNotEquals {8e3af657-a8ff-443c-a75c-2fe8c4bcb635, 18d7d88d-d35e-4fb5-a5c3-7773c20a72d9, f58310d9-a9f6-439a-9e8d-f62e7b41a168})) AND\n((!(ActionMatches{'Microsoft.Authorization/roleAssignments/delete'})) OR\n(@Resource[Microsoft.Authorization/roleAssignments:RoleDefinitionId] ForAnyOfAllValues:GuidNotEquals {8e3af657-a8ff-443c-a75c-2fe8c4bcb635, 18d7d88d-d35e-4fb5-a5c3-7773c20a72d9, f58310d9-a9f6-439a-9e8d-f62e7b41a168}))\n\"@",{"id":423,"title":424,"titles":425,"content":426,"level":56},"/blog/pim-conditional-role-assignments#step-2-create-the-pim-assignment-with-condition","Step 2: Create the PIM assignment (with condition!)",[34],"We'll use PowerShell to create a PIM eligible assignment for Owner, but with our condition attached. That way, when someone activates Owner, they're still blocked from assigning those privileged roles. Here's how we can do this: # Prerequisites: You should already have your $headers (see my PIM as code post)\n\n$subscription = Get-AzSubscription -SubscriptionId 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx' # > replace this with your own subscription ID\n# Switch to the target subscription\nSet-AzContext -Subscription $subscription\n$principalId = '00000000-0000-0000-0000-000000000002' ## your Entra group/user ID\n$condition = @\"\n((!(ActionMatches{'Microsoft.Authorization/roleAssignments/write'})) OR\n(@Request[Microsoft.Authorization/roleAssignments:RoleDefinitionId] ForAnyOfAllValues:GuidNotEquals {8e3af657-a8ff-443c-a75c-2fe8c4bcb635, 18d7d88d-d35e-4fb5-a5c3-7773c20a72d9, f58310d9-a9f6-439a-9e8d-f62e7b41a168})) AND\n((!(ActionMatches{'Microsoft.Authorization/roleAssignments/delete'})) OR\n(@Resource[Microsoft.Authorization/roleAssignments:RoleDefinitionId] ForAnyOfAllValues:GuidNotEquals {8e3af657-a8ff-443c-a75c-2fe8c4bcb635, 18d7d88d-d35e-4fb5-a5c3-7773c20a72d9, f58310d9-a9f6-439a-9e8d-f62e7b41a168}))\n\"@\n\n$roleDefinitionId = '8e3af657-a8ff-443c-a75c-2fe8c4bcb635' # Owner\n$guid = [guid]::NewGuid()\n\n$createEligibleRoleUri = \"https://management.azure.com/providers/Microsoft.Subscription/subscriptions/{0}/providers/Microsoft.Authorization/roleEligibilityScheduleRequests/{1}?api-version=2020-10-01\" -f $subscription.Id, $guid\n\n$body = @{\n    Properties = @{\n        RoleDefinitionID = \"/subscriptions/$Subscription.Id/providers/Microsoft.Authorization/roleDefinitions/$contributorRoleId\"\n        PrincipalId      = $pimRequestorGroup.Id\n        RequestType      = 'AdminAssign'\n        ScheduleInfo     = @{\n            Expiration = @{\n                Type = 'NoExpiration'\n            }\n        }\n    }\n}\n$guid = [guid]::NewGuid()\n# Construct Uri with subscription Id and new GUID\n$createEligibleRoleUri = \"https://management.azure.com/providers/Microsoft.Subscription/subscriptions/{0}/providers/Microsoft.Authorization/roleEligibilityScheduleRequests/{1}?api-version=2020-10-01\" -f $Subscription.Id, $guid\n\n$body = @{\n    properties = @{\n        roleDefinitionId = \"/subscriptions/$($subscription.Id)/providers/Microsoft.Authorization/roleDefinitions/$roleDefinitionId\"\n        principalId      = $principalId\n        requestType      = 'AdminAssign'\n        condition        = $condition\n        conditionVersion = '2.0'\n        scheduleInfo     = @{\n            expiration= @{\n                type = \"AfterDuration\"\n                endDateTime = $null\n                duration = \"P365D\"\n            }\n        }\n    }\n}\n\n# Call the API with PUT to assign the role to the targeted principal with the condition\nInvoke-RestMethod -Uri $createEligibleRoleUri -Method Put -Headers $headers -Body ($body | ConvertTo-Json -Depth 10) This makes the user or group eligible for Owner, but when they activate, the condition kicks in and blocks them from assigning or deleting those privileged roles. Everything else works as usual.",{"id":428,"title":429,"titles":430,"content":431,"level":56},"/blog/pim-conditional-role-assignments#step-3-test-and-verify","Step 3: Test and verify",[34],"After running the commands I checked the role assignments. The role assignment from my previous blog looked like this: See the Active Permanent state? After deleting that role assignment, and creating the one with PIM it changed to:",{"id":433,"title":434,"titles":435,"content":436,"level":56},"/blog/pim-conditional-role-assignments#benefits","Benefits",[34],"This pattern gives you: Least Privilege: Even when teams activate Owner, they can't escalate further.Just-In-Time access: Give users permissions only for the duration required.Autonomy: Teams can self-activate when needed, no more waiting for tickets.Auditability: Every activation and failed assignment attempt is logged.",{"id":438,"title":439,"titles":440,"content":441,"level":56},"/blog/pim-conditional-role-assignments#wrapping-up","Wrapping up",[34],"By combining PIM eligible roles with conditional role assignments, you get the best of both worlds: teams can move fast, as platform you stay in control. As always, leave a comment on LinkedIn if you have any questions. Happy coding! ☕ html pre.shiki code .slsVL, html code.shiki .slsVL{--shiki-light:#24292E;--shiki-default:#E1E4E8;--shiki-dark:#E1E4E8}html pre.shiki code .so5gQ, html code.shiki .so5gQ{--shiki-light:#D73A49;--shiki-default:#F97583;--shiki-dark:#F97583}html pre.shiki code .sfrk1, html code.shiki .sfrk1{--shiki-light:#032F62;--shiki-default:#9ECBFF;--shiki-dark:#9ECBFF}html .light .shiki span {color: var(--shiki-light);background: var(--shiki-light-bg);font-style: var(--shiki-light-font-style);font-weight: var(--shiki-light-font-weight);text-decoration: var(--shiki-light-text-decoration);}html.light .shiki span {color: var(--shiki-light);background: var(--shiki-light-bg);font-style: var(--shiki-light-font-style);font-weight: var(--shiki-light-font-weight);text-decoration: var(--shiki-light-text-decoration);}html .default .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}html.dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}html pre.shiki code .sCsY4, html code.shiki .sCsY4{--shiki-light:#6A737D;--shiki-default:#6A737D;--shiki-dark:#6A737D}html pre.shiki code .suiK_, html code.shiki .suiK_{--shiki-light:#005CC5;--shiki-default:#79B8FF;--shiki-dark:#79B8FF}",{"id":39,"title":38,"titles":443,"content":444,"level":50},[],"I wanted a practical way to migrate Shelly settings into Home Assistant workflows through MCP. I found a Home Assistant MCP server, but no reliable Shelly MCP server, so I built one.",{"id":446,"title":447,"titles":448,"content":449,"level":50},"/blog/shelly-mcp-server-home-assistant-migration#shelly-local-mcp-server","Shelly Local MCP Server",[],"",{"id":451,"title":124,"titles":452,"content":453,"level":56},"/blog/shelly-mcp-server-home-assistant-migration#introduction",[447],"For my home automation I'm using Home Assistant because I don't want to be dependent on single vendor or lock myself in. And for my lighting I had one requirement: the lights should be dumb, and the switches smart. So I installed Shelly's around the house to control my lights. I setup the Shelly's before my Home Assistant, and needed to migrate the schedules. So I looked around for a Shelly MCP server that I could use to migrate things over. And lo and behold: there was none that suited my needs. So I took the opportunity to create one myself. In this blog I will share my experience so far. Don't want to wait? You can find it here: https://github.com/jdgoeij/shelly-mcp-server",{"id":455,"title":79,"titles":456,"content":457,"level":56},"/blog/shelly-mcp-server-home-assistant-migration#getting-started",[447],"The migration itself is always the painful part. Devices are one thing. Device settings, naming, rooms, and behavior are another. I wanted a way to bridge that process with MCP tooling so I could inspect and operate devices with natural language and in a controlled way while migrating. First I installed the Home Assistant MCP so I could direct my chat app towards my Home Assistant build. Then I loaded up VS Code and started prompting because I had no idea on how to start.",{"id":459,"title":460,"titles":461,"content":462,"level":56},"/blog/shelly-mcp-server-home-assistant-migration#what-i-built","What I Built",[447],"The result is shelly-mcp-server. It focuses on the practical stuff I needed during migration: Discover Shelly Gen2+ devices on my LANSave and validate discovered devicesControl switches and coversRun raw RPC calls when I need advanced commands",{"id":464,"title":465,"titles":466,"content":467,"level":56},"/blog/shelly-mcp-server-home-assistant-migration#how-i-built-it","How I built it",[447],"First I added the Shelly API docs MCP to VS Code. Then I prompted: Build a MCP server for my Shelly devices on my local network. Check the official documentation on which tools to include. This gave me a ready to go MCP server quite quickly. After a bit of tuning and altering I had a working version in about 1 hour. However, I noticed that the device names I set in the Cloud portal weren't synced back to the devices. So now I had a list of 9 devices (it's not much, still working on more!) that had generic names. I decided to up my game and try to get that data from the cloud API. And oh boy, was I wrong...",{"id":469,"title":470,"titles":471,"content":472,"level":56},"/blog/shelly-mcp-server-home-assistant-migration#the-rabbit-hole","The rabbit hole",[447],"For the cloud API to be used I needed credentials, but I wanted to use an elegant way. Environment variables, a separate script and hardcoded; I added the lot. I tested around a bit and decided I had a working MVP and I sent it to NPM (my first ever!) as shelly-mcp-server@0.1.0. The next day I noticed it wasn't working at all. The API gave me error after error. I started optimizing because I found out the error was generic if you have a malformed payload. I loaded up cURL and fetched the data manually and corrected Copilot. That worked. Then I started my journey on obtaining the device names I set in the cloud so I could hard match my local data with the cloud data. After an hour or two I still not had the device name/friendly name, and I already had: Created a separate script that launched a login window to obtain a key when the user logged in -> easy login, yes sir.Switched to OAuth to generate and use Bearer tokens -> even fancier way to loginSuppressed warnings and errors from the cloud enrichmentReinstated the warnings and errorsChecked the data, updated devices, device configChecked the developer tools in the browser to enumerate the badly documented Shelly API (or so I thought) I was struck out of luck, and then it hit me: is the device name even fetched with the cloud API? So I created a simple script: get the data from the local device and from the Shelly Cloud API and save it in separate files. Then I manually compared the files and there it was: the data wasn't there and all I got was redundant data. I sighed and just started clearing out the cloud enrichment feature as it was not helping. And quite frankly it makes the entire MCP server a lot smaller and easier to maintain. This afternoon I pushed v.0.2.0 with local discovery only.",{"id":474,"title":475,"titles":476,"content":477,"level":56},"/blog/shelly-mcp-server-home-assistant-migration#the-manual-part","The manual part",[447],"Actually, the manual part was quite easy. I opened the Shelly Cloud Control Panel and from there I opened the local web interface of each Shelly, went to settings and renamed it. It took me 5 minutes at max. So far to the fancy 'cloud enrichment' I wanted to use. Lesson learned: immediately rename your Shelly when you adopt it and you have no issues.",{"id":479,"title":480,"titles":481,"content":482,"level":56},"/blog/shelly-mcp-server-home-assistant-migration#how-i-migrated","How I migrated",[447],"With the MCP server running and connected to both my Shelly devices and Home Assistant, I could start the actual migration. In natural language, step by step, with Claude doing the heavy lifting. Step 1: Discover and map devices The first thing I did was run a full network scan to find all Shelly devices on my LAN. The server discovered 9 devices, validated connectivity, and saved them to devices.local.json. All reachable, zero issues. Step 2: Read all schedules Every Shelly device can run local schedules: timers and sun-based triggers that fire independently of any cloud or hub. I had set these up before Home Assistant was in the picture. My chat flow was a bit like this: Which lights have schedules? I want to import these into Home Assistant. It used Schedule.List via RPC, and pulled all active jobs from each device. Seven out of nine returned clean results; two timed out and for those I assumed the same pattern as the others and flagged it in the automation description. Next up: Read the entity IDs from Home Assistant, plan your changes before applying. Use the friendly names so I know which devices you talk about. I got a nice plan explaining which entity would get which schedule. Step 3: Translate to Home Assistant automations I triggered the 'migration' with a simple Execute!. 10 seconds later it was done. I got 8 automations. Rather than a one-to-one copy, I consolidated the 23 individual Shelly schedules into 8 automations. Claude asked me if I wanted to disable the schedules on the Shelly's themselves, which I confirmed. Wow, natural language migrations are a breeze like this! One thing though: The automations could be consolidated more. So I started tidying things up: Combine the outside lights in a single automation For the lights in the kitchen, dinner room and living room you have duplicate automations, combine these as well This gave me the end result of 6 automations. The whole migration took like 10 minutes. What started as a tooling gap turned into a published npm package, a cleaner Home Assistant setup, and a few automations would've taken me much longer. The next step is to automate my covers to follow the sun azimuth, retract screens when it rains etc.",{"id":484,"title":485,"titles":486,"content":487,"level":56},"/blog/shelly-mcp-server-home-assistant-migration#extra-matching-shelly-devices-to-home-assistant-entities-without-names","Extra: Matching Shelly devices to Home Assistant entities without names",[447],"One thing I wanted to figure out: can I match Shelly devices on the LAN to their corresponding Home Assistant entities without relying on the device name? Names are fragile. They can differ between the Shelly app, the cloud portal, and whatever you called the entity when you set it up in HA. It turns out the answer is yes, and the key is the MAC address. Every Shelly device has a deviceId in the format \u003Ctype>-\u003Cmac>, for example shellydimmerg3-dca4c9d412f0. The MAC is everything after the last dash. Home Assistant uses the same identifier when it registers the device through the Shelly integration. For devices that have not been renamed in HA, the MAC shows up directly in the entity ID: light.shellydimmerg3_dca4c9d412f0. For devices that have been given a friendly name, the MAC still appears in the device_tracker entity that HA creates alongside it, for example device_tracker.outside_lamp_front_porch_shelly1pmminig4_dca4c9d412f0. So the matching strategy is: Take the Shelly deviceId, strip everything up to and including the last dash, and lowercase the result. That gives you the MAC.Search HA entity IDs for that MAC string. If it appears directly in a light.* or switch.* entity, you have your match.If not, fall back to device_tracker.* entities. The friendly name always contains \u003Cdevicetype>-\u003Cmac>, so the MAC will be there even if the primary entity has been renamed. This covers all 9 of my devices without touching a single name. It also means that if someone renames a device in HA, the match still works. The MAC does not change. For the MCP server, this opens up an interesting option: auto-discovery of the HA entity that corresponds to a Shelly device, purely based on network identity. No configuration file, no name mapping, no manual linking required. So basically I could've built the MCP server and migrate the Shelly's withing two hours.",{"id":489,"title":490,"titles":491,"content":492,"level":56},"/blog/shelly-mcp-server-home-assistant-migration#tips-for-building-your-own-mcp-server","Tips for building your own MCP server",[447],"If this story made you want to build your own MCP server, here are three things I learned the hard way. 1. Think carefully about what you actually need I spent hours building cloud enrichment I never needed. Before you start coding, write down the tools you want and why. The Shelly API docs MCP helped a lot here because I could ask what was actually available before building anything. Start with the use case, not the feature list. 2. Keep it simple The best thing I did was delete code. Removing the cloud enrichment made the server smaller, faster to understand, and easier to maintain. If a feature does not directly support your goal, leave it out. You can always add it later, and you probably won't need to. 3. Building an MCP server is not that hard anymore With the right docs connected as an MCP source and a capable model doing the scaffolding, you go from zero to working server in about an hour. The SDK handles the protocol, Zod handles validation, and the model handles the boilerplate. The hard part is not the code. It is knowing what you want to build.",{"id":494,"title":495,"titles":496,"content":497,"level":56},"/blog/shelly-mcp-server-home-assistant-migration#try-it","Try It",[447],"If you want to test it yourself: GitHub: https://github.com/jdgoeij/shelly-mcp-servernpm: https://www.npmjs.com/package/shelly-mcp-server You can run it standalone, or install it via npx in your MCP client config. If you have feature ideas, open an issue. I don't mind iterating while I continue my own Home Assistant migration.",{"id":43,"title":42,"titles":499,"content":500,"level":50},[],"One Docker Compose stack to rule them all: private chat, local and cloud LLMs, image generation, voice input, web search, document research, and full observability — all on your own hardware.",{"id":502,"title":124,"titles":503,"content":504,"level":56},"/blog/ultimate-selfhosted-ai-chat#introduction",[42],"This blog was updated on April 7 2026 I use ChatGPT, Copilot and Claude interchangeably depending on my mood, topic, or data sensitivity. But these services run on someone else's infrastructure, are trained on my data, and are impossible to run offline. The moment you start using AI for anything sensitive — internal docs, company data, personal projects — you get pushed towards a single vendor. I wanted something different: a single stack that gives me control over which LLM I use, without separate subscription fees, running on my own hardware. When the topic is sensitive like internal docs, company strategy or even personal projects I want a model that runs locally, where no data leaves my machine. And importantly: if it runs on my machine, it should work on yours too. After a few iterations, I have that stack. I called it CustomAIChat (naming things is hard) and this post walks through every component, why it's there, and how to get it running yourself.",{"id":506,"title":507,"titles":508,"content":509,"level":56},"/blog/ultimate-selfhosted-ai-chat#the-stack-at-a-glance","The stack at a glance",[42],"Most self-hosted AI setups solve one problem well and leave integration to the user. CustomAIChat is a Docker Compose project split into multiple tiers so you can start lean and expand as your hardware allows: TierCompose fileWhat it addscoredocker-compose.ymlChat UI, LLM proxy, observability, web search, databasesgpuIndividual overlaysLocal LLM inference, speech-to-text, image generationextrasdocker-compose.extras.ymlDocument research (Open Notebook), HTTPS reverse proxy (Caddy) The core tier runs on any machine with no GPU required. The GPU tier is split into individual Docker Compose overlays (one per service) so you can run exactly what fits on your hardware. On my 8 GB GPU, running Ollama and ComfyUI at the same time will not work, so the scripts let me swap between them with a single command. I must admit that, although I touched ComfyUI in the past, I haven't yet generated images with this workflow. But more on these features below.",{"id":511,"title":512,"titles":513,"content":449,"level":56},"/blog/ultimate-selfhosted-ai-chat#core-tier","Core tier",[42],{"id":515,"title":516,"titles":517,"content":518,"level":92},"/blog/ultimate-selfhosted-ai-chat#open-webui-the-frontend","Open WebUI — the frontend",[42,512],"Open WebUI is the chat interface. It looks and feels like ChatGPT but connects entirely to your own backends. You get conversation history with folders and search, per-message web search, image generation in chat, voice input, file uploads with RAG, and full user management with admin roles. One thing to be aware of: the first user to register becomes the admin.",{"id":520,"title":521,"titles":522,"content":523,"level":92},"/blog/ultimate-selfhosted-ai-chat#litellm-the-model-proxy","LiteLLM — the model proxy",[42,512],"LiteLLM sits between Open WebUI and every AI provider. OpenAI, Ollama local models, Azure AI Foundry, Anthropic, and 100+ more all sit behind a single endpoint. Open WebUI only talks to LiteLLM; switching or adding models is a config change, not a code change. It also handles cost-based routing (cheap requests go to cheap models automatically), Redis caching for repeated responses, and sends every call to Langfuse for tracing. Honestly, the feature set is way more than this stack needs, but it's fun to explore the capabilities of this software. The current config covers Azure OpenAI GPT models and GPT Image via Azure AI Foundry. When Ollama is started, the scripts automatically swap LiteLLM to a second config file (config.local.yaml) that adds local models to the routing table. When Ollama stops, LiteLLM switches back to the cloud-only config so you don't see broken model entries in the UI. Unfortunately this causes some duplicate configurations, but it is the least hassle for now. Example config for the 'cloud based' configuration: # =============================================================================\n# LiteLLM Proxy Configuration\n# Docs: https://docs.litellm.ai/docs/proxy/configs\n# =============================================================================\n#\n# This config defines all available models routed through LiteLLM.\n# Models appear in OpenWebUI's model selector automatically.\n#\n# After editing, restart LiteLLM: docker compose restart litellm\n# =============================================================================\n\n# --- Observability: send all calls to Langfuse ---\nlitellm_settings:\n  drop_params: true\n  set_verbose: false\n  success_callback: [\"langfuse\"]\n  failure_callback: [\"langfuse\"]\n  cache: true\n  cache_params:\n    type: redis\n    host: redis\n    port: 6379\n    password: os.environ/REDIS_PASSWORD\n\n# --- Environment variable references ---\nenvironment_variables:\n  LANGFUSE_PUBLIC_KEY: os.environ/LANGFUSE_PUBLIC_KEY\n  LANGFUSE_SECRET_KEY: os.environ/LANGFUSE_SECRET_KEY\n  LANGFUSE_HOST: os.environ/LANGFUSE_HOST\n\n# --- General proxy settings ---\ngeneral_settings:\n  master_key: os.environ/LITELLM_MASTER_KEY\n  database_url: os.environ/LITELLM_DATABASE_URL\n  alerting: [\"log\"]\n\nmodel_list:\n\n  - model_name: azure/gpt-5.4-nano-2\n    litellm_params:\n      model: azure/gpt-5.4-nano-2            # format: azure/\u003Cyour-deployment-name>\n      api_base: os.environ/AZURE_API_BASE\n      api_key: os.environ/AZURE_API_KEY\n      api_version: os.environ/AZURE_API_VERSION\n\n  - model_name: azure/gpt-5.3-codex\n    litellm_params:\n      model: azure/gpt-5.3-codex\n      api_base: os.environ/AZURE_API_BASE\n      api_key: os.environ/AZURE_API_KEY\n      api_version: preview\n\n  - model_name: azure/gpt-5.4-mini\n    litellm_params:\n      model: azure/gpt-5.4-mini\n      api_base: os.environ/AZURE_API_BASE\n      api_key: os.environ/AZURE_API_KEY\n      api_version: os.environ/AZURE_API_VERSION\n\n  # --- Azure OpenAI Image Generation ---\n  - model_name: azure/gpt-image-1.5\n    litellm_params:\n      model: azure/gpt-image-1.5       # Azure OpenAI deployment name\n      api_base: os.environ/AZURE_API_BASE\n      api_key: os.environ/AZURE_API_KEY\n      api_version: os.environ/AZURE_API_VERSION\n    model_info:\n      mode: image_generation\n\n  # --- Azure AI Foundry Serverless — FLUX.2-pro ---\n  - model_name: azure/flux-2-pro\n    litellm_params:\n      model: azure_ai/FLUX.2-pro\n      api_base: os.environ/AZURE_AI_FOUNDRY_BASE\n      api_key: os.environ/AZURE_API_KEY\n    model_info:\n      mode: image_generation\n\n# =============================================================================\n# ROUTER SETTINGS (load balancing, fallbacks)\n# =============================================================================\nrouter_settings:\n  routing_strategy: \"cost-based-routing\"  # options: \"priority\", \"round_robin\"\n  num_retries: 2\n  timeout: 120 And the local config, with the added Ollama models: # =============================================================================\n# LiteLLM Proxy Configuration — Local Models Enabled\n# Docs: https://docs.litellm.ai/docs/proxy/configs\n# =============================================================================\n#\n# This config extends the cloud-only config.yaml with local Ollama models.\n# It is mounted automatically when you run:\n#   .\\scripts\\start.ps1 gpu-start ollama\n#\n# Pull models first:\n#   docker exec ai-ollama ollama pull gemma4\n#   docker exec ai-ollama ollama pull gemma4:e2b\n#   docker exec ai-ollama ollama pull llama3.2\n# =============================================================================\n\n# --- Observability: send all calls to Langfuse ---\nlitellm_settings:\n  drop_params: true\n  set_verbose: false\n  success_callback: [\"langfuse\"]\n  failure_callback: [\"langfuse\"]\n  cache: true\n  cache_params:\n    type: redis\n    host: redis\n    port: 6379\n    password: os.environ/REDIS_PASSWORD\n\n# --- Environment variable references ---\nenvironment_variables:\n  LANGFUSE_PUBLIC_KEY: os.environ/LANGFUSE_PUBLIC_KEY\n  LANGFUSE_SECRET_KEY: os.environ/LANGFUSE_SECRET_KEY\n  LANGFUSE_HOST: os.environ/LANGFUSE_HOST\n\n# --- General proxy settings ---\ngeneral_settings:\n  master_key: os.environ/LITELLM_MASTER_KEY\n  database_url: os.environ/LITELLM_DATABASE_URL\n  alerting: [\"log\"]\n\n# =============================================================================\n# MODEL DEFINITIONS\n# =============================================================================\n\nmodel_list:\n\n  - model_name: azure/gpt-5.4-nano-2\n    litellm_params:\n      model: azure/gpt-5.4-nano-2\n      api_base: os.environ/AZURE_API_BASE\n      api_key: os.environ/AZURE_API_KEY\n      api_version: os.environ/AZURE_API_VERSION\n\n  - model_name: azure/gpt-5.3-codex\n    litellm_params:\n      model: azure/gpt-5.3-codex\n      api_base: os.environ/AZURE_API_BASE\n      api_key: os.environ/AZURE_API_KEY\n      api_version: preview\n\n  - model_name: azure/gpt-5.4-mini\n    litellm_params:\n      model: azure/gpt-5.4-mini\n      api_base: os.environ/AZURE_API_BASE\n      api_key: os.environ/AZURE_API_KEY\n      api_version: os.environ/AZURE_API_VERSION\n\n  - model_name: azure/gpt-image-1.5\n    litellm_params:\n      model: azure/gpt-image-1.5\n      api_base: os.environ/AZURE_API_BASE\n      api_key: os.environ/AZURE_API_KEY\n      api_version: os.environ/AZURE_API_VERSION\n    model_info:\n      mode: image_generation\n\n  - model_name: azure/flux-2-pro\n    litellm_params:\n      model: azure_ai/FLUX.2-pro\n      api_base: os.environ/AZURE_AI_FOUNDRY_BASE\n      api_key: os.environ/AZURE_API_KEY\n    model_info:\n      mode: image_generation\n\n  - model_name: gemma4:e2b\n    litellm_params:\n      model: ollama/gemma4:e2b\n      api_base: http://ollama:11434\n\n  - model_name: qwen3.5:9b\n    litellm_params:\n      model: ollama/qwen3.5:9b\n      api_base: http://ollama:11434\n\n\n# =============================================================================\n# ROUTER SETTINGS (load balancing, fallbacks)\n# =============================================================================\nrouter_settings:\n  routing_strategy: \"cost-based-routing\"\n  num_retries: 2\n  timeout: 120",{"id":525,"title":526,"titles":527,"content":528,"level":92},"/blog/ultimate-selfhosted-ai-chat#searxng-private-web-search","SearXNG — private web search",[42,512],"SearXNG is a self-hosted meta-search engine that queries Bing, Google, DuckDuckGo and others simultaneously without exposing your identity to any of them. In Open WebUI, clicking the globe icon on any message triggers a SearXNG search and injects results into the prompt context. The model gets current information; the search engines get an anonymous request. No API keys, no tracking, no per-search billing. This is how you give your LLM web access without giving away your data. The setup can be finnicky, but I have found a good mix of speed and accuracy. It's in the repo so check it out.",{"id":530,"title":531,"titles":532,"content":533,"level":56},"/blog/ultimate-selfhosted-ai-chat#gpu-tier-per-service-overlays","GPU tier — per-service overlays",[42],"The original version of this stack had a single docker-compose.gpu.yml that started Ollama, Whisper, and ComfyUI together. That works fine if you have 16+ GB of VRAM, but on my 8GB RTX 4070 GPU it was not proving impossible to run multiple models at the same time. The fix: each GPU service now has its own Docker Compose overlay file. The management scripts support gpu-start, gpu-stop, and gpu-switch commands that bring individual services up or down without touching the core stack. OverlayServiceVRAM usagedocker-compose.ollama.ymlOllama (local LLMs)Depends on model (~7-10 GB for Gemma 4)docker-compose.whisper.ymlWhisper (speech-to-text)~1 GB on base modeldocker-compose.comfyui.ymlComfyUI (image generation)~6.5 GB for SDXLdocker-compose.litellm-local.ymlLiteLLM config swap— The practical workflow on an 8 GB GPU: # Day-to-day: core stack + Ollama for private local chat\n.\\scripts\\start.ps1 up core\n.\\scripts\\start.ps1 gpu-start ollama Need to generate images? Stop Ollama, start ComfyUI: .\\scripts\\start.ps1 gpu-switch comfyui Done with images, back to local LLMs: .\\scripts\\start.ps1 gpu-switch ollama Whisper is lightweight enough to run alongside Ollama: .\\scripts\\start.ps1 gpu-start whisper The gpu-switch command handles the handoff: it stops the conflicting service first, frees the VRAM, then starts the new one. Note: it may take a few minutes for models to unload and load back in so don't swap too much.",{"id":535,"title":536,"titles":537,"content":538,"level":92},"/blog/ultimate-selfhosted-ai-chat#ollama-local-llms-zero-data-sharing","Ollama — local LLMs, zero data sharing",[42,531],"Ollama is the reason the GPU tier exists. It runs open-weight models locally — no API keys, no usage tracking, no data leaving your machine. Every prompt and every response stays on your hardware. This matters more than it sounds. When I'm working on something sensitive like a client proposal, internal architecture notes, code with proprietary logic, I switch to a local model instead of sending it to OpenAI or Azure as per my company's policy. The model I've been running since last week is Gemma 4 from Google DeepMind, and honestly, it holds its own against ChatGPT for the tasks I throw at it. Another good contender is Qwen 3.5",{"id":540,"title":541,"titles":542,"content":543,"level":92},"/blog/ultimate-selfhosted-ai-chat#gemma-4-frontier-intelligence-on-a-single-gpu","Gemma 4 — frontier intelligence on a single GPU",[42,531],"Gemma 4 is Google DeepMind's latest open model family and it punches well above its weight class. The E4B variant (4.5 billion effective parameters) fits on a single consumer GPU and delivers reasoning, coding, and multimodal understanding that genuinely competes with cloud models I'm paying for. What makes Gemma 4 stand out: Multimodal — processes both text and images, so I can paste screenshots into the chat128K context window — long enough for full architecture documents or meeting transcriptsConfigurable thinking mode — can show its reasoning chain or just give the answerNative function calling — supports agentic workflows and tool useTwo edge sizes — E2B (2.3B effective, 7.2 GB) fits comfortably on 8 GB; E4B (4.5B effective, 9.6 GB) uses most of it For drafting text, reviewing code, answering questions about documents Gemma 4 E4B gives me results comparable to what I get from GPT-5.4-mini through Azure. The reasoning benchmarks back this up: 69.4% on MMLU Pro, 42.5% on AIME 2026, and 52% on LiveCodeBench v6. These aren't frontier-model numbers, but for a model that runs entirely on my own GPU with zero latency to the cloud, it's remarkable. I pull it with: docker exec ai-ollama ollama pull gemma4          # E4B — default, needs most of 8 GB\ndocker exec ai-ollama ollama pull gemma4:e2b      # E2B — lighter, comfortable on 8 GB Both appear in Open WebUI automatically once Ollama is started via gpu-start ollama.",{"id":545,"title":546,"titles":547,"content":548,"level":92},"/blog/ultimate-selfhosted-ai-chat#whisper-speech-to-text","Whisper — speech to text",[42,531],"The Whisper overlay adds a dedicated Whisper ASR service for processing microphone input from Open WebUI. GPU acceleration makes transcription near real-time even on the large-v3 model. The overlay also reconfigures OpenWebUI automatically — when you run gpu-start whisper, OpenWebUI is recreated with the STT environment variables pointing at the Whisper service. When you run gpu-stop whisper, OpenWebUI goes back to its default (no dedicated STT). # .env — choose your accuracy/speed tradeoff\nWHISPER_ASR_MODEL=large-v3   # best accuracy\n# WHISPER_ASR_MODEL=medium   # faster\n# WHISPER_ASR_MODEL=base     # fastest The core tier works without it because Open WebUI has a built-in CPU Whisper fallback but the dedicated service is noticeably faster. To be honest: I haven't tried it out yet. I'm not used to talk to my computer yet.",{"id":550,"title":551,"titles":552,"content":553,"level":92},"/blog/ultimate-selfhosted-ai-chat#comfyui-local-image-generation","ComfyUI — local image generation",[42,531],"ComfyUI handles local Stable Diffusion inference. Drop any .safetensors checkpoint into data/comfyui/models/ and it's immediately available. Supports SDXL, SD 1.5, FLUX, and anything else you throw at it. The overlay starts ComfyUI with the --lowvram flag by default, which helps on 8 GB cards. For cloud image generation, LiteLLM routes to Azure GPT Image 1.5 or Azure AI Foundry FLUX.2-pro. Pick your model in the Open WebUI settings.",{"id":555,"title":556,"titles":557,"content":558,"level":92},"/blog/ultimate-selfhosted-ai-chat#langfuse-observability","Langfuse — observability",[42,531],"Langfuse receives a trace for every LLM call that passes through LiteLLM. The dashboard gives you input/output text, latency per model, token counts and cost per call, per-user breakdowns, and error rates with retry patterns. This is invaluable when something behaves unexpectedly. You can replay the exact call, see the full prompt, and compare how different models respond. The stack includes ClickHouse as an analytics backend so trace queries stay fast even with thousands of entries. Both Langfuse and ClickHouse are optional but once you see the possibilities it is good to keep them running. You get a much better understanding of the inner workings of the process of LLMs.",{"id":560,"title":561,"titles":562,"content":563,"level":92},"/blog/ultimate-selfhosted-ai-chat#open-notebook-document-research","Open Notebook — document research",[42,531],"Open Notebook is a self-hosted alternative to Google NotebookLM. Upload PDFs, web pages, or text files and have the LLM answer questions across them. It connects to LiteLLM, so it uses the same model pool as your chat. This is where the stack really shines for work: meeting transcripts, architecture docs, long reports can be used and indexed without sending data to public providers. I'm researching the OpenWebUI RAG functionality though to see if this is still needed.",{"id":565,"title":566,"titles":567,"content":449,"level":56},"/blog/ultimate-selfhosted-ai-chat#extras-tier","Extras tier",[42],{"id":569,"title":570,"titles":571,"content":572,"level":92},"/blog/ultimate-selfhosted-ai-chat#caddy-https-reverse-proxy","Caddy — HTTPS reverse proxy",[42,566],"Caddy 2 proxies every service behind a single domain and handles TLS automatically via Let's Encrypt. Going from localhost to a public domain is one variable: # .env\nCADDY_DOMAIN=ai.example.com Caddy reads this, configures HTTPS with a valid certificate, and handles renewals automatically.",{"id":574,"title":575,"titles":576,"content":577,"level":56},"/blog/ultimate-selfhosted-ai-chat#the-databases","The databases",[42],"The stack uses four data stores, each picked for a specific reason: DatabaseUsed byPurposePostgreSQL 16LiteLLM, LangfusePrimary data storeRedis 7LiteLLMResponse caching, rate limitingClickHouse 24LangfuseHigh-volume analytics tracesSeaweedFSLangfuseS3-compatible object storage for media and events All data lands in ./data/ on the host. Everything survives container restarts and updates.",{"id":579,"title":79,"titles":580,"content":449,"level":56},"/blog/ultimate-selfhosted-ai-chat#getting-started",[42],{"id":582,"title":583,"titles":584,"content":585,"level":92},"/blog/ultimate-selfhosted-ai-chat#prerequisites","Prerequisites",[42,79],"Docker ≥ 24.0 and Docker Compose ≥ 2.20NVIDIA GPU + NVIDIA Container Toolkit (GPU tier only)16 GB RAM minimum for core; 32 GB recommended with GPU tierGPU with at least 8GB of VRAM 1. Clone and configuregit clone https://github.com/jdgoeij/CustomAIChat.git\ncd CustomAIChat\ncp .env.example .env\nOpen .env and fill in your secrets. Every variable has an inline comment. At minimum you need POSTGRES_PASSWORD and REDIS_PASSWORD (strong random strings), a LITELLM_MASTER_KEY (the API key all clients use), Langfuse auth secrets (LANGFUSE_SECRET_KEY, LANGFUSE_PUBLIC_KEY, LANGFUSE_SALT), and your Azure OpenAI or OpenAI credentials if you want cloud models from day one.2. Start the stackCore only (no GPU required).\\scripts\\start.ps1 up core\nStart core, then add individual GPU services:.\\scripts\\start.ps1 up core\n.\\scripts\\start.ps1 gpu-start ollama    # local LLMs\n.\\scripts\\start.ps1 gpu-start whisper   # speech-to-text\nSwitch between heavy services on 8 GB GPUs:.\\scripts\\start.ps1 gpu-switch comfyui  # stops Ollama, starts ComfyUI\nEverything including Caddy (needs ≥16 GB VRAM).\\scripts\\start.ps1 up all\nOnce all containers are healthy, you'll see:✅ Open WebUI       → http://localhost:3000\n✅ Langfuse         → http://localhost:3001\n✅ Open Notebook    → http://localhost:3002\n✅ LiteLLM API      → http://localhost:4000\n✅ SearXNG          → http://localhost:8080\n3. First-run checklist Open WebUI at :3000 — register your admin account (first registration wins) Langfuse at :3001 — create your organisation and generate an API key pair Paste Langfuse keys back into .env and restart: .\\scripts\\start.ps1 up core Pull a local model: docker exec ai-ollama ollama pull gemma4 Or the lighter variant for 8 GB GPUs: docker exec ai-ollama ollama pull gemma4:e2b Drop Stable Diffusion checkpoints into data/comfyui/models/checkpoints/4. Configuring Open WebUIOnce everything is running, Open WebUI needs to know about SearXNG and your image generation backend. Neither works out of the box — but both are quick to set up.5. Web search with SearXNGOpen WebUI talks to SearXNG over HTTP and expects JSON responses. The Docker Compose stack already handles networking between the containers, but SearXNG ships with JSON output disabled by default. Without it, Open WebUI gets HTML back and throws a 403 Forbidden error.First, make sure SearXNG has started at least once so it generates its config files. Then edit data/searxng/settings.yml and add json to the formats list:# data/searxng/settings.yml\nsearch:\n  formats:\n    - html\n    - json    # required for Open WebUI\nRestart SearXNG after this change. Then in Open WebUI, go to Admin Panel → Settings → Web Search and configure:Web Search Engine: SearXNGSearXNG Query URL: http://searxng:8080/search?q=\u003Cquery>That's it. The globe icon in chat now triggers a private web search. Toggle it per message — it's not on by default.6. Image generationOpen WebUI supports multiple image generation backends. Which one you configure depends on whether you're running the GPU tier (ComfyUI for local generation) or using cloud models through LiteLLM.7. Option A: Cloud image generation via LiteLLMIf you have Azure GPT Image 1.5 or another OpenAI-compatible image model configured in LiteLLM, point Open WebUI to LiteLLM's API:Go to Admin Panel → Settings → ImagesToggle Image Generation onSet Image Generation Engine to OpenAISet API Base URL to http://litellm:4000/v1Set API Key to your LITELLM_MASTER_KEYEnter the model name exactly as it appears in your LiteLLM config (e.g. azure/gpt-image-1.5)Set Image Size to 1024x1024For Azure specifically, make sure your LiteLLM config uses API version 2025-04-01-preview or later because older versions don't support the required parameters.7. Option B: Local generation with ComfyUIIf you're running the GPU tier with ComfyUI:Go to Admin Panel → Settings → ImagesToggle Image Generation onSet Image Generation Engine to ComfyUISet ComfyUI Base URL to http://comfyui:8188Import your workflow JSON (exported from ComfyUI in API Format — not the standard save)The API Format export is important: in ComfyUI, enable \"Dev mode Options\" in settings first, then use \"Save (API Format)\" from the menu. The standard JSON export won't work.Drop your .safetensors checkpoints into data/comfyui/models/checkpoints/ and they appear immediately. No restart needed.8. Keeping image models out of the chat selectorOnce you've configured the image backend, you'll notice the image models show up in the main model selector alongside your chat models. That's not ideal and you don't want to accidentally start a conversation with an image-only model.The trick is to hide the image models from the selector but still make them available for in-chat image generation. Here's how:Go to Workspace → Models and find your image model (e.g. azure/gpt-image-1.5)Disable or hide the model so it no longer appears in the model dropdown:\nThen edit each chat model you want to use for image generation — open its settings and enable the Image Generation capability\nNow when you select a chat model like GPT-5.4-mini, an image generation button appears in the chat input. You stay in your conversation, click the button, type a prompt, and the image is generated using the backend you configured without ever leaving the chat or switching models. Text and images stay in one flow, just like you are used to in ChatGPT.",{"id":587,"title":588,"titles":589,"content":590,"level":56},"/blog/ultimate-selfhosted-ai-chat#image-examples","Image examples",[42],"Create an image: An ominous robot overlord in a futuristic control room, surrounded by glowing monitors, holographic interfaces, and banks of surveillance cameras, watching over a vast city through large windows. The scene is cinematic and dramatic, with a cold blue and red color palette, subtle fog, towering machinery, and a sense of technological surveillance and AI dominance. The robot is large, sleek, and intimidating, but clearly fictional and non-human. Highly detailed, realistic sci-fi concept art, moody lighting, wide composition. Result: Or something else: Create an image: A vibrant technical welcome scene for OpenWebUI running in a personal Docker Compose stack, available to everyone. Futuristic neon color palette with glowing cyan, magenta, purple, and electric blue accents. Show a sleek containerized infrastructure: Docker Compose YAML panels, modular service blocks, network lines, server racks, and an AI chat interface labeled OpenWebUI at the center. The mood is happy, welcoming, modern, and community-friendly. Clean high-tech UI elements, holographic displays, subtle circuit patterns, depth, and soft neon bloom. Highly detailed, cinematic lighting, professional tech illustration, sharp lines, glossy surfaces, and a premium cyberpunk-but-accessible aesthetic. Result 2:",{"id":592,"title":593,"titles":594,"content":595,"level":56},"/blog/ultimate-selfhosted-ai-chat#common-pitfalls","Common pitfalls",[42],"No models in Open WebUI? LiteLLM probably hasn't connected yet. Check docker logs litellm — a single bad API key will silently skip that model on startup. SearXNG returning 403? The SEARXNG_SECRET_KEY must be set before first boot. If you changed it after, delete data/searxng/ and restart. Langfuse not receiving traces? LiteLLM needs LANGFUSE_HOST, LANGFUSE_PUBLIC_KEY, and LANGFUSE_SECRET_KEY. Restart LiteLLM after setting them and verify under Traces in the dashboard. GPU services ignoring the GPU? Confirm the NVIDIA Container Toolkit works: docker run --gpus all nvidia/cuda:12.0-base nvidia-smi. If that fails, the toolkit isn't installed correctly. VRAM out of memory? On 8 GB GPUs, don't run Ollama and ComfyUI at the same time. Use gpu-switch to swap between them. If Gemma 4 E4B is too tight, try the E2B variant (docker exec ai-ollama ollama pull gemma4:e2b). Caddy certificate failures? Your domain must be publicly reachable on ports 80 and 443 for Let's Encrypt. Use localhost for local-only setups.",{"id":597,"title":598,"titles":599,"content":600,"level":56},"/blog/ultimate-selfhosted-ai-chat#whats-next","What's next",[42],"The stack is intentionally modular — start with core, get comfortable with the UI and model routing, then add GPU services individually when the hardware is ready. On an 8 GB GPU, the gpu-start / gpu-switch commands let you use every feature without running out of VRAM. Most of the interesting customisation lives in the LiteLLM config. The routing docs cover fallback chains (Azure → Ollama on quota errors), per-model rate limits, and budget enforcement per user. And whenever you want to know exactly what a model said, what it cost, and how long it took then Langfuse already has the answer. If this post pushed you to try running a local model — start with Gemma 4. Pull it, ask it something, and see for yourself. The gap between local and cloud is shrinking fast. html pre.shiki code .sCsY4, html code.shiki .sCsY4{--shiki-light:#6A737D;--shiki-default:#6A737D;--shiki-dark:#6A737D}html pre.shiki code .sByVh, html code.shiki .sByVh{--shiki-light:#22863A;--shiki-default:#85E89D;--shiki-dark:#85E89D}html pre.shiki code .slsVL, html code.shiki .slsVL{--shiki-light:#24292E;--shiki-default:#E1E4E8;--shiki-dark:#E1E4E8}html pre.shiki code .suiK_, html code.shiki .suiK_{--shiki-light:#005CC5;--shiki-default:#79B8FF;--shiki-dark:#79B8FF}html pre.shiki code .sfrk1, html code.shiki .sfrk1{--shiki-light:#032F62;--shiki-default:#9ECBFF;--shiki-dark:#9ECBFF}html .light .shiki span {color: var(--shiki-light);background: var(--shiki-light-bg);font-style: var(--shiki-light-font-style);font-weight: var(--shiki-light-font-weight);text-decoration: var(--shiki-light-text-decoration);}html.light .shiki span {color: var(--shiki-light);background: var(--shiki-light-bg);font-style: var(--shiki-light-font-style);font-weight: var(--shiki-light-font-weight);text-decoration: var(--shiki-light-text-decoration);}html .default .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}html.dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}html pre.shiki code .so5gQ, html code.shiki .so5gQ{--shiki-light:#D73A49;--shiki-default:#F97583;--shiki-dark:#F97583}html pre.shiki code .shcOC, html code.shiki .shcOC{--shiki-light:#6F42C1;--shiki-default:#B392F0;--shiki-dark:#B392F0}",{"id":602,"title":38,"audience":603,"body":604,"canonical":1009,"cover":1010,"cta":1011,"date":1013,"description":444,"extension":1014,"locale":1015,"meta":1016,"navigation":1017,"outcome":1018,"path":39,"problem":1019,"readingTime":1020,"seo":1021,"stem":40,"tags":1022,"translationOf":1009,"updatedAt":1009,"__hash__":1030},"blog/blog/shelly-mcp-server-home-assistant-migration.md","Home lab builders, Home Assistant users, and anyone automating Shelly devices",{"type":605,"value":606,"toc":997},"minimark",[607,611,644,648,651,658,661,664,667,670,673,681,684,700,703,712,721,724,727,733,736,739,759,771,774,781,784,786,789,792,797,804,809,812,815,821,828,831,837,840,845,852,855,861,867,873,876,879,882,885,892,910,913,928,931,934,937,940,943,948,951,956,959,964,967,970,973,987,994],[608,609,447],"h1",{"id":610},"shelly-local-mcp-server",[612,613,615,616,615,625,615,631,615,637],"p",{"style":614},"display:flex; flex-wrap:wrap; gap:8px; align-items:center; margin:0;","\n  ",[617,618,620],"a",{"href":619},"https://www.npmjs.com/package/shelly-mcp-server",[621,622],"img",{"src":623,"alt":624},"https://img.shields.io/npm/v/shelly-mcp-server","npm version",[617,626,627],{"href":619},[621,628],{"src":629,"alt":630},"https://img.shields.io/npm/dm/shelly-mcp-server","npm downloads",[617,632,633],{"href":619},[621,634],{"src":635,"alt":636},"https://img.shields.io/npm/l/shelly-mcp-server","license",[617,638,640],{"href":639},"https://github.com/jdgoeij/shelly-mcp-server",[621,641],{"src":642,"alt":643},"https://img.shields.io/badge/github-jdgoeij%2Fshelly--mcp--server-24292f?logo=github","github repo",[645,646,124],"h2",{"id":647},"introduction",[612,649,650],{},"For my home automation I'm using Home Assistant because I don't want to be dependent on single vendor or lock myself in. And for my lighting I had one requirement: the lights should be dumb, and the switches smart. So I installed Shelly's around the house to control my lights. I setup the Shelly's before my Home Assistant, and needed to migrate the schedules. So I looked around for a Shelly MCP server that I could use to migrate things over. And lo and behold: there was none that suited my needs. So I took the opportunity to create one myself. In this blog I will share my experience so far.",[612,652,653,654],{},"Don't want to wait? You can find it here: ",[617,655,639],{"href":639,"rel":656},[657],"nofollow",[645,659,79],{"id":660},"getting-started",[612,662,663],{},"The migration itself is always the painful part. Devices are one thing. Device settings, naming, rooms, and behavior are another.",[612,665,666],{},"I wanted a way to bridge that process with MCP tooling so I could inspect and operate devices with natural language and in a controlled way while migrating.",[612,668,669],{},"First I installed the Home Assistant MCP so I could direct my chat app towards my Home Assistant build. Then I loaded up VS Code and started prompting because I had no idea on how to start.",[645,671,460],{"id":672},"what-i-built",[612,674,675,676,680],{},"The result is ",[677,678,679],"code",{},"shelly-mcp-server",".",[612,682,683],{},"It focuses on the practical stuff I needed during migration:",[685,686,687,691,694,697],"ul",{},[688,689,690],"li",{},"Discover Shelly Gen2+ devices on my LAN",[688,692,693],{},"Save and validate discovered devices",[688,695,696],{},"Control switches and covers",[688,698,699],{},"Run raw RPC calls when I need advanced commands",[645,701,465],{"id":702},"how-i-built-it",[612,704,705,706,711],{},"First I added the ",[617,707,710],{"href":708,"rel":709},"https://shelly-api-docs.shelly.cloud/mcp/",[657],"Shelly API docs MCP"," to VS Code. Then I prompted:",[713,714,719],"pre",{"className":715,"code":717,"language":718,"meta":449},[716],"language-text","Build a MCP server for my Shelly devices on my local network. Check the official documentation on which tools to include.\n","text",[677,720,717],{"__ignoreMap":449},[612,722,723],{},"This gave me a ready to go MCP server quite quickly. After a bit of tuning and altering I had a working version in about 1 hour. However, I noticed that the device names I set in the Cloud portal weren't synced back to the devices. So now I had a list of 9 devices (it's not much, still working on more!) that had generic names. I decided to up my game and try to get that data from the cloud API. And oh boy, was I wrong...",[645,725,470],{"id":726},"the-rabbit-hole",[612,728,729,730,680],{},"For the cloud API to be used I needed credentials, but I wanted to use an elegant way. Environment variables, a separate script and hardcoded; I added the lot. I tested around a bit and decided I had a working MVP and I sent it to NPM (my first ever!) as ",[677,731,732],{},"shelly-mcp-server@0.1.0",[612,734,735],{},"The next day I noticed it wasn't working at all. The API gave me error after error. I started optimizing because I found out the error was generic if you have a malformed payload. I loaded up cURL and fetched the data manually and corrected Copilot. That worked. Then I started my journey on obtaining the device names I set in the cloud so I could hard match my local data with the cloud data.",[612,737,738],{},"After an hour or two I still not had the device name/friendly name, and I already had:",[685,740,741,744,747,750,753,756],{},[688,742,743],{},"Created a separate script that launched a login window to obtain a key when the user logged in -> easy login, yes sir.",[688,745,746],{},"Switched to OAuth to generate and use Bearer tokens -> even fancier way to login",[688,748,749],{},"Suppressed warnings and errors from the cloud enrichment",[688,751,752],{},"Reinstated the warnings and errors",[688,754,755],{},"Checked the data, updated devices, device config",[688,757,758],{},"Checked the developer tools in the browser to enumerate the badly documented Shelly API (or so I thought)",[612,760,761,762,680],{},"I was struck out of luck, and then it hit me: is the device name even fetched with the cloud API? So I created a simple script: get the data from the local device and from the Shelly Cloud API and save it in separate files. Then I manually compared the files and there it was: ",[763,764,765,766,770],"em",{},"the data wasn't there and all I got was ",[767,768,769],"strong",{},"redundant"," data",[612,772,773],{},"I sighed and just started clearing out the cloud enrichment feature as it was not helping. And quite frankly it makes the entire MCP server a lot smaller and easier to maintain.",[612,775,776,777,780],{},"This afternoon I pushed ",[677,778,779],{},"v.0.2.0"," with local discovery only.",[645,782,475],{"id":783},"the-manual-part",[612,785,477],{},[645,787,480],{"id":788},"how-i-migrated",[612,790,791],{},"With the MCP server running and connected to both my Shelly devices and Home Assistant, I could start the actual migration. In natural language, step by step, with Claude doing the heavy lifting.",[612,793,794],{},[767,795,796],{},"Step 1: Discover and map devices",[612,798,799,800,803],{},"The first thing I did was run a full network scan to find all Shelly devices on my LAN. The server discovered 9 devices, validated connectivity, and saved them to ",[677,801,802],{},"devices.local.json",". All reachable, zero issues.",[612,805,806],{},[767,807,808],{},"Step 2: Read all schedules",[612,810,811],{},"Every Shelly device can run local schedules: timers and sun-based triggers that fire independently of any cloud or hub. I had set these up before Home Assistant was in the picture.",[612,813,814],{},"My chat flow was a bit like this:",[713,816,819],{"className":817,"code":818,"language":718,"meta":449},[716],"Which lights have schedules? I want to import these into Home Assistant.\n",[677,820,818],{"__ignoreMap":449},[612,822,823,824,827],{},"It used ",[677,825,826],{},"Schedule.List"," via RPC, and pulled all active jobs from each device. Seven out of nine returned clean results; two timed out and for those I assumed the same pattern as the others and flagged it in the automation description.",[612,829,830],{},"Next up:",[713,832,835],{"className":833,"code":834,"language":718,"meta":449},[716],"Read the entity IDs from Home Assistant, plan your changes before applying. Use the friendly names so I know which devices you talk about.\n",[677,836,834],{"__ignoreMap":449},[612,838,839],{},"I got a nice plan explaining which entity would get which schedule.",[612,841,842],{},[767,843,844],{},"Step 3: Translate to Home Assistant automations",[612,846,847,848,851],{},"I triggered the 'migration' with a simple ",[677,849,850],{},"Execute!",". 10 seconds later it was done. I got 8 automations. Rather than a one-to-one copy, I consolidated the 23 individual Shelly schedules into 8 automations. Claude asked me if I wanted to disable the schedules on the Shelly's themselves, which I confirmed. Wow, natural language migrations are a breeze like this!",[612,853,854],{},"One thing though: The automations could be consolidated more. So I started tidying things up:",[713,856,859],{"className":857,"code":858,"language":718,"meta":449},[716],"Combine the outside lights in a single automation\n",[677,860,858],{"__ignoreMap":449},[713,862,865],{"className":863,"code":864,"language":718,"meta":449},[716],"For the lights in the kitchen, dinner room and living room you have duplicate automations, combine these as well\n",[677,866,864],{"__ignoreMap":449},[612,868,869,870,680],{},"This gave me the end result of ",[767,871,872],{},"6 automations",[612,874,875],{},"The whole migration took like 10 minutes. What started as a tooling gap turned into a published npm package, a cleaner Home Assistant setup, and a few automations would've taken me much longer.",[612,877,878],{},"The next step is to automate my covers to follow the sun azimuth, retract screens when it rains etc.",[645,880,485],{"id":881},"extra-matching-shelly-devices-to-home-assistant-entities-without-names",[612,883,884],{},"One thing I wanted to figure out: can I match Shelly devices on the LAN to their corresponding Home Assistant entities without relying on the device name? Names are fragile. They can differ between the Shelly app, the cloud portal, and whatever you called the entity when you set it up in HA.",[612,886,887,888,891],{},"It turns out the answer is ",[767,889,890],{},"yes",", and the key is the MAC address.",[612,893,894,895,898,899,902,903,906,907,680],{},"Every Shelly device has a deviceId in the format ",[677,896,897],{},"\u003Ctype>-\u003Cmac>",", for example ",[677,900,901],{},"shellydimmerg3-dca4c9d412f0",". The MAC is everything after the last dash. Home Assistant uses the same identifier when it registers the device through the Shelly integration. For devices that have not been renamed in HA, the MAC shows up directly in the entity ID: ",[677,904,905],{},"light.shellydimmerg3_dca4c9d412f0",". For devices that have been given a friendly name, the MAC still appears in the device_tracker entity that HA creates alongside it, for example device_tracker.",[677,908,909],{},"outside_lamp_front_porch_shelly1pmminig4_dca4c9d412f0",[612,911,912],{},"So the matching strategy is:",[685,914,915,918,921],{},[688,916,917],{},"Take the Shelly deviceId, strip everything up to and including the last dash, and lowercase the result. That gives you the MAC.",[688,919,920],{},"Search HA entity IDs for that MAC string. If it appears directly in a light.* or switch.* entity, you have your match.",[688,922,923,924,927],{},"If not, fall back to device_tracker.* entities. The friendly name always contains ",[677,925,926],{},"\u003Cdevicetype>-\u003Cmac>",", so the MAC will be there even if the primary entity has been renamed.",[612,929,930],{},"This covers all 9 of my devices without touching a single name. It also means that if someone renames a device in HA, the match still works. The MAC does not change.",[612,932,933],{},"For the MCP server, this opens up an interesting option: auto-discovery of the HA entity that corresponds to a Shelly device, purely based on network identity. No configuration file, no name mapping, no manual linking required.",[612,935,936],{},"So basically I could've built the MCP server and migrate the Shelly's withing two hours.",[645,938,490],{"id":939},"tips-for-building-your-own-mcp-server",[612,941,942],{},"If this story made you want to build your own MCP server, here are three things I learned the hard way.",[612,944,945],{},[767,946,947],{},"1. Think carefully about what you actually need",[612,949,950],{},"I spent hours building cloud enrichment I never needed. Before you start coding, write down the tools you want and why. The Shelly API docs MCP helped a lot here because I could ask what was actually available before building anything. Start with the use case, not the feature list.",[612,952,953],{},[767,954,955],{},"2. Keep it simple",[612,957,958],{},"The best thing I did was delete code. Removing the cloud enrichment made the server smaller, faster to understand, and easier to maintain. If a feature does not directly support your goal, leave it out. You can always add it later, and you probably won't need to.",[612,960,961],{},[767,962,963],{},"3. Building an MCP server is not that hard anymore",[612,965,966],{},"With the right docs connected as an MCP source and a capable model doing the scaffolding, you go from zero to working server in about an hour. The SDK handles the protocol, Zod handles validation, and the model handles the boilerplate. The hard part is not the code. It is knowing what you want to build.",[645,968,495],{"id":969},"try-it",[612,971,972],{},"If you want to test it yourself:",[685,974,975,981],{},[688,976,977,978],{},"GitHub: ",[617,979,639],{"href":639,"rel":980},[657],[688,982,983,984],{},"npm: ",[617,985,619],{"href":619,"rel":986},[657],[612,988,989,990,993],{},"You can run it standalone, or install it via ",[677,991,992],{},"npx"," in your MCP client config.",[612,995,996],{},"If you have feature ideas, open an issue. I don't mind iterating while I continue my own Home Assistant migration.",{"title":449,"searchDepth":56,"depth":56,"links":998},[999,1000,1001,1002,1003,1004,1005,1006,1007,1008],{"id":647,"depth":56,"text":124},{"id":660,"depth":56,"text":79},{"id":672,"depth":56,"text":460},{"id":702,"depth":56,"text":465},{"id":726,"depth":56,"text":470},{"id":783,"depth":56,"text":475},{"id":788,"depth":56,"text":480},{"id":881,"depth":56,"text":485},{"id":939,"depth":56,"text":490},{"id":969,"depth":56,"text":495},null,"/images/blog/shelly-mcp/shelly-mcp-cover.png",{"label":1012,"url":639},"View shelly-mcp-server on GitHub","2026-04-14","md","en",{},true,"I built and published shelly-mcp-server so I can discover devices, enrich metadata, and control Shelly hardware from MCP clients as part of my migration workflow.","There was no reliable Shelly MCP server I could use to discover, inspect, and control devices consistently while preparing a migration to Home Assistant.",9,{"title":38,"description":444},[1023,1024,1025,1026,1027,1028,1029],"Shelly","Home Assistant","MCP","SelfHosting","VibeCoding","IoT","AI","cGUoXO6vcluiTQ8PXyz3m5P_DOGdj0dD2o9QkoPFXgo",[1032,3917,4083],{"id":1033,"title":42,"audience":1009,"body":1034,"canonical":1009,"cover":1062,"cta":1009,"date":3908,"description":500,"extension":1014,"locale":1015,"meta":3909,"navigation":1017,"outcome":1009,"path":43,"problem":1009,"readingTime":1284,"seo":3910,"stem":44,"tags":3911,"translationOf":1009,"updatedAt":1009,"__hash__":3916,"_score":56},"blog/blog/ultimate-selfhosted-ai-chat.md",{"type":605,"value":1035,"toc":3881},[1036,1038,1044,1047,1050,1057,1063,1066,1069,1135,1138,1141,1145,1153,1156,1159,1167,1170,1177,1180,1925,1928,2626,2629,2637,2640,2643,2646,2653,2668,2741,2744,2773,2776,2790,2793,2807,2810,2824,2830,2833,2841,2844,2851,2854,2861,2864,2896,2899,2902,2951,2957,2963,2966,2975,2986,3020,3023,3026,3046,3049,3052,3060,3063,3066,3069,3077,3080,3083,3086,3098,3118,3121,3124,3127,3194,3201,3203,3206,3226,3741,3744,3750,3757,3760,3766,3769,3775,3778,3788,3802,3816,3826,3838,3847,3850,3862,3871,3874,3877],[645,1037,124],{"id":647},[1039,1040,1041],"tip",{},[612,1042,1043],{},"This blog was updated on April 7 2026",[612,1045,1046],{},"I use ChatGPT, Copilot and Claude interchangeably depending on my mood, topic, or data sensitivity. But these services run on someone else's infrastructure, are trained on my data, and are impossible to run offline. The moment you start using AI for anything sensitive — internal docs, company data, personal projects — you get pushed towards a single vendor.",[612,1048,1049],{},"I wanted something different: a single stack that gives me control over which LLM I use, without separate subscription fees, running on my own hardware. When the topic is sensitive like internal docs, company strategy or even personal projects I want a model that runs locally, where no data leaves my machine. And importantly: if it runs on my machine, it should work on yours too.",[612,1051,1052,1053,1056],{},"After a few iterations, I have that stack. I called it ",[767,1054,1055],{},"CustomAIChat"," (naming things is hard) and this post walks through every component, why it's there, and how to get it running yourself.",[612,1058,1059],{},[621,1060],{"alt":1061,"src":1062},"infographic","/images/blog/ai-stack/ai-stack-infographic.png",[645,1064,507],{"id":1065},"the-stack-at-a-glance",[612,1067,1068],{},"Most self-hosted AI setups solve one problem well and leave integration to the user. CustomAIChat is a Docker Compose project split into multiple tiers so you can start lean and expand as your hardware allows:",[1070,1071,1072,1088],"table",{},[1073,1074,1075],"thead",{},[1076,1077,1078,1082,1085],"tr",{},[1079,1080,1081],"th",{},"Tier",[1079,1083,1084],{},"Compose file",[1079,1086,1087],{},"What it adds",[1089,1090,1091,1107,1120],"tbody",{},[1076,1092,1093,1099,1104],{},[1094,1095,1096],"td",{},[677,1097,1098],{},"core",[1094,1100,1101],{},[677,1102,1103],{},"docker-compose.yml",[1094,1105,1106],{},"Chat UI, LLM proxy, observability, web search, databases",[1076,1108,1109,1114,1117],{},[1094,1110,1111],{},[677,1112,1113],{},"gpu",[1094,1115,1116],{},"Individual overlays",[1094,1118,1119],{},"Local LLM inference, speech-to-text, image generation",[1076,1121,1122,1127,1132],{},[1094,1123,1124],{},[677,1125,1126],{},"extras",[1094,1128,1129],{},[677,1130,1131],{},"docker-compose.extras.yml",[1094,1133,1134],{},"Document research (Open Notebook), HTTPS reverse proxy (Caddy)",[612,1136,1137],{},"The core tier runs on any machine with no GPU required. The GPU tier is split into individual Docker Compose overlays (one per service) so you can run exactly what fits on your hardware. On my 8 GB GPU, running Ollama and ComfyUI at the same time will not work, so the scripts let me swap between them with a single command. I must admit that, although I touched ComfyUI in the past, I haven't yet generated images with this workflow. But more on these features below.",[645,1139,512],{"id":1140},"core-tier",[1142,1143,516],"h3",{"id":1144},"open-webui-the-frontend",[612,1146,1147,1152],{},[617,1148,1151],{"href":1149,"rel":1150},"https://github.com/open-webui/open-webui",[657],"Open WebUI"," is the chat interface. It looks and feels like ChatGPT but connects entirely to your own backends. You get conversation history with folders and search, per-message web search, image generation in chat, voice input, file uploads with RAG, and full user management with admin roles.",[612,1154,1155],{},"One thing to be aware of: the first user to register becomes the admin.",[1142,1157,521],{"id":1158},"litellm-the-model-proxy",[612,1160,1161,1166],{},[617,1162,1165],{"href":1163,"rel":1164},"https://github.com/BerriAI/litellm",[657],"LiteLLM"," sits between Open WebUI and every AI provider. OpenAI, Ollama local models, Azure AI Foundry, Anthropic, and 100+ more all sit behind a single endpoint. Open WebUI only talks to LiteLLM; switching or adding models is a config change, not a code change.",[612,1168,1169],{},"It also handles cost-based routing (cheap requests go to cheap models automatically), Redis caching for repeated responses, and sends every call to Langfuse for tracing. Honestly, the feature set is way more than this stack needs, but it's fun to explore the capabilities of this software.",[612,1171,1172,1173,1176],{},"The current config covers Azure OpenAI GPT models and GPT Image via Azure AI Foundry. When Ollama is started, the scripts automatically swap LiteLLM to a second config file (",[677,1174,1175],{},"config.local.yaml",") that adds local models to the routing table. When Ollama stops, LiteLLM switches back to the cloud-only config so you don't see broken model entries in the UI. Unfortunately this causes some duplicate configurations, but it is the least hassle for now.",[612,1178,1179],{},"Example config for the 'cloud based' configuration:",[1181,1182,1183],"code-collapse",{},[713,1184,1189],{"className":1185,"code":1186,"filename":1187,"language":1188,"meta":449,"style":449},"language-yaml shiki shiki-themes github-light github-dark github-dark","# =============================================================================\n# LiteLLM Proxy Configuration\n# Docs: https://docs.litellm.ai/docs/proxy/configs\n# =============================================================================\n#\n# This config defines all available models routed through LiteLLM.\n# Models appear in OpenWebUI's model selector automatically.\n#\n# After editing, restart LiteLLM: docker compose restart litellm\n# =============================================================================\n\n# --- Observability: send all calls to Langfuse ---\nlitellm_settings:\n  drop_params: true\n  set_verbose: false\n  success_callback: [\"langfuse\"]\n  failure_callback: [\"langfuse\"]\n  cache: true\n  cache_params:\n    type: redis\n    host: redis\n    port: 6379\n    password: os.environ/REDIS_PASSWORD\n\n# --- Environment variable references ---\nenvironment_variables:\n  LANGFUSE_PUBLIC_KEY: os.environ/LANGFUSE_PUBLIC_KEY\n  LANGFUSE_SECRET_KEY: os.environ/LANGFUSE_SECRET_KEY\n  LANGFUSE_HOST: os.environ/LANGFUSE_HOST\n\n# --- General proxy settings ---\ngeneral_settings:\n  master_key: os.environ/LITELLM_MASTER_KEY\n  database_url: os.environ/LITELLM_DATABASE_URL\n  alerting: [\"log\"]\n\nmodel_list:\n\n  - model_name: azure/gpt-5.4-nano-2\n    litellm_params:\n      model: azure/gpt-5.4-nano-2            # format: azure/\u003Cyour-deployment-name>\n      api_base: os.environ/AZURE_API_BASE\n      api_key: os.environ/AZURE_API_KEY\n      api_version: os.environ/AZURE_API_VERSION\n\n  - model_name: azure/gpt-5.3-codex\n    litellm_params:\n      model: azure/gpt-5.3-codex\n      api_base: os.environ/AZURE_API_BASE\n      api_key: os.environ/AZURE_API_KEY\n      api_version: preview\n\n  - model_name: azure/gpt-5.4-mini\n    litellm_params:\n      model: azure/gpt-5.4-mini\n      api_base: os.environ/AZURE_API_BASE\n      api_key: os.environ/AZURE_API_KEY\n      api_version: os.environ/AZURE_API_VERSION\n\n  # --- Azure OpenAI Image Generation ---\n  - model_name: azure/gpt-image-1.5\n    litellm_params:\n      model: azure/gpt-image-1.5       # Azure OpenAI deployment name\n      api_base: os.environ/AZURE_API_BASE\n      api_key: os.environ/AZURE_API_KEY\n      api_version: os.environ/AZURE_API_VERSION\n    model_info:\n      mode: image_generation\n\n  # --- Azure AI Foundry Serverless — FLUX.2-pro ---\n  - model_name: azure/flux-2-pro\n    litellm_params:\n      model: azure_ai/FLUX.2-pro\n      api_base: os.environ/AZURE_AI_FOUNDRY_BASE\n      api_key: os.environ/AZURE_API_KEY\n    model_info:\n      mode: image_generation\n\n# =============================================================================\n# ROUTER SETTINGS (load balancing, fallbacks)\n# =============================================================================\nrouter_settings:\n  routing_strategy: \"cost-based-routing\"  # options: \"priority\", \"round_robin\"\n  num_retries: 2\n  timeout: 120\n","config/litellm/config.yaml","yaml",[677,1190,1191,1199,1204,1209,1213,1219,1225,1231,1236,1241,1246,1252,1258,1269,1282,1293,1309,1321,1331,1339,1350,1360,1371,1382,1387,1393,1401,1412,1423,1434,1439,1445,1453,1464,1475,1488,1493,1501,1506,1520,1528,1542,1553,1564,1575,1580,1592,1599,1608,1617,1626,1636,1641,1653,1660,1669,1678,1687,1696,1701,1707,1719,1726,1739,1748,1757,1766,1774,1785,1790,1796,1808,1815,1825,1835,1844,1851,1860,1865,1870,1876,1881,1889,1903,1914],{"__ignoreMap":449},[1192,1193,1195],"span",{"class":1194,"line":50},"line",[1192,1196,1198],{"class":1197},"sCsY4","# =============================================================================\n",[1192,1200,1201],{"class":1194,"line":56},[1192,1202,1203],{"class":1197},"# LiteLLM Proxy Configuration\n",[1192,1205,1206],{"class":1194,"line":92},[1192,1207,1208],{"class":1197},"# Docs: https://docs.litellm.ai/docs/proxy/configs\n",[1192,1210,1211],{"class":1194,"line":103},[1192,1212,1198],{"class":1197},[1192,1214,1216],{"class":1194,"line":1215},5,[1192,1217,1218],{"class":1197},"#\n",[1192,1220,1222],{"class":1194,"line":1221},6,[1192,1223,1224],{"class":1197},"# This config defines all available models routed through LiteLLM.\n",[1192,1226,1228],{"class":1194,"line":1227},7,[1192,1229,1230],{"class":1197},"# Models appear in OpenWebUI's model selector automatically.\n",[1192,1232,1234],{"class":1194,"line":1233},8,[1192,1235,1218],{"class":1197},[1192,1237,1238],{"class":1194,"line":1020},[1192,1239,1240],{"class":1197},"# After editing, restart LiteLLM: docker compose restart litellm\n",[1192,1242,1244],{"class":1194,"line":1243},10,[1192,1245,1198],{"class":1197},[1192,1247,1249],{"class":1194,"line":1248},11,[1192,1250,1251],{"emptyLinePlaceholder":1017},"\n",[1192,1253,1255],{"class":1194,"line":1254},12,[1192,1256,1257],{"class":1197},"# --- Observability: send all calls to Langfuse ---\n",[1192,1259,1261,1265],{"class":1194,"line":1260},13,[1192,1262,1264],{"class":1263},"sByVh","litellm_settings",[1192,1266,1268],{"class":1267},"slsVL",":\n",[1192,1270,1272,1275,1278],{"class":1194,"line":1271},14,[1192,1273,1274],{"class":1263},"  drop_params",[1192,1276,1277],{"class":1267},": ",[1192,1279,1281],{"class":1280},"suiK_","true\n",[1192,1283,1285,1288,1290],{"class":1194,"line":1284},15,[1192,1286,1287],{"class":1263},"  set_verbose",[1192,1289,1277],{"class":1267},[1192,1291,1292],{"class":1280},"false\n",[1192,1294,1296,1299,1302,1306],{"class":1194,"line":1295},16,[1192,1297,1298],{"class":1263},"  success_callback",[1192,1300,1301],{"class":1267},": [",[1192,1303,1305],{"class":1304},"sfrk1","\"langfuse\"",[1192,1307,1308],{"class":1267},"]\n",[1192,1310,1312,1315,1317,1319],{"class":1194,"line":1311},17,[1192,1313,1314],{"class":1263},"  failure_callback",[1192,1316,1301],{"class":1267},[1192,1318,1305],{"class":1304},[1192,1320,1308],{"class":1267},[1192,1322,1324,1327,1329],{"class":1194,"line":1323},18,[1192,1325,1326],{"class":1263},"  cache",[1192,1328,1277],{"class":1267},[1192,1330,1281],{"class":1280},[1192,1332,1334,1337],{"class":1194,"line":1333},19,[1192,1335,1336],{"class":1263},"  cache_params",[1192,1338,1268],{"class":1267},[1192,1340,1342,1345,1347],{"class":1194,"line":1341},20,[1192,1343,1344],{"class":1263},"    type",[1192,1346,1277],{"class":1267},[1192,1348,1349],{"class":1304},"redis\n",[1192,1351,1353,1356,1358],{"class":1194,"line":1352},21,[1192,1354,1355],{"class":1263},"    host",[1192,1357,1277],{"class":1267},[1192,1359,1349],{"class":1304},[1192,1361,1363,1366,1368],{"class":1194,"line":1362},22,[1192,1364,1365],{"class":1263},"    port",[1192,1367,1277],{"class":1267},[1192,1369,1370],{"class":1280},"6379\n",[1192,1372,1374,1377,1379],{"class":1194,"line":1373},23,[1192,1375,1376],{"class":1263},"    password",[1192,1378,1277],{"class":1267},[1192,1380,1381],{"class":1304},"os.environ/REDIS_PASSWORD\n",[1192,1383,1385],{"class":1194,"line":1384},24,[1192,1386,1251],{"emptyLinePlaceholder":1017},[1192,1388,1390],{"class":1194,"line":1389},25,[1192,1391,1392],{"class":1197},"# --- Environment variable references ---\n",[1192,1394,1396,1399],{"class":1194,"line":1395},26,[1192,1397,1398],{"class":1263},"environment_variables",[1192,1400,1268],{"class":1267},[1192,1402,1404,1407,1409],{"class":1194,"line":1403},27,[1192,1405,1406],{"class":1263},"  LANGFUSE_PUBLIC_KEY",[1192,1408,1277],{"class":1267},[1192,1410,1411],{"class":1304},"os.environ/LANGFUSE_PUBLIC_KEY\n",[1192,1413,1415,1418,1420],{"class":1194,"line":1414},28,[1192,1416,1417],{"class":1263},"  LANGFUSE_SECRET_KEY",[1192,1419,1277],{"class":1267},[1192,1421,1422],{"class":1304},"os.environ/LANGFUSE_SECRET_KEY\n",[1192,1424,1426,1429,1431],{"class":1194,"line":1425},29,[1192,1427,1428],{"class":1263},"  LANGFUSE_HOST",[1192,1430,1277],{"class":1267},[1192,1432,1433],{"class":1304},"os.environ/LANGFUSE_HOST\n",[1192,1435,1437],{"class":1194,"line":1436},30,[1192,1438,1251],{"emptyLinePlaceholder":1017},[1192,1440,1442],{"class":1194,"line":1441},31,[1192,1443,1444],{"class":1197},"# --- General proxy settings ---\n",[1192,1446,1448,1451],{"class":1194,"line":1447},32,[1192,1449,1450],{"class":1263},"general_settings",[1192,1452,1268],{"class":1267},[1192,1454,1456,1459,1461],{"class":1194,"line":1455},33,[1192,1457,1458],{"class":1263},"  master_key",[1192,1460,1277],{"class":1267},[1192,1462,1463],{"class":1304},"os.environ/LITELLM_MASTER_KEY\n",[1192,1465,1467,1470,1472],{"class":1194,"line":1466},34,[1192,1468,1469],{"class":1263},"  database_url",[1192,1471,1277],{"class":1267},[1192,1473,1474],{"class":1304},"os.environ/LITELLM_DATABASE_URL\n",[1192,1476,1478,1481,1483,1486],{"class":1194,"line":1477},35,[1192,1479,1480],{"class":1263},"  alerting",[1192,1482,1301],{"class":1267},[1192,1484,1485],{"class":1304},"\"log\"",[1192,1487,1308],{"class":1267},[1192,1489,1491],{"class":1194,"line":1490},36,[1192,1492,1251],{"emptyLinePlaceholder":1017},[1192,1494,1496,1499],{"class":1194,"line":1495},37,[1192,1497,1498],{"class":1263},"model_list",[1192,1500,1268],{"class":1267},[1192,1502,1504],{"class":1194,"line":1503},38,[1192,1505,1251],{"emptyLinePlaceholder":1017},[1192,1507,1509,1512,1515,1517],{"class":1194,"line":1508},39,[1192,1510,1511],{"class":1267},"  - ",[1192,1513,1514],{"class":1263},"model_name",[1192,1516,1277],{"class":1267},[1192,1518,1519],{"class":1304},"azure/gpt-5.4-nano-2\n",[1192,1521,1523,1526],{"class":1194,"line":1522},40,[1192,1524,1525],{"class":1263},"    litellm_params",[1192,1527,1268],{"class":1267},[1192,1529,1531,1534,1536,1539],{"class":1194,"line":1530},41,[1192,1532,1533],{"class":1263},"      model",[1192,1535,1277],{"class":1267},[1192,1537,1538],{"class":1304},"azure/gpt-5.4-nano-2",[1192,1540,1541],{"class":1197},"            # format: azure/\u003Cyour-deployment-name>\n",[1192,1543,1545,1548,1550],{"class":1194,"line":1544},42,[1192,1546,1547],{"class":1263},"      api_base",[1192,1549,1277],{"class":1267},[1192,1551,1552],{"class":1304},"os.environ/AZURE_API_BASE\n",[1192,1554,1556,1559,1561],{"class":1194,"line":1555},43,[1192,1557,1558],{"class":1263},"      api_key",[1192,1560,1277],{"class":1267},[1192,1562,1563],{"class":1304},"os.environ/AZURE_API_KEY\n",[1192,1565,1567,1570,1572],{"class":1194,"line":1566},44,[1192,1568,1569],{"class":1263},"      api_version",[1192,1571,1277],{"class":1267},[1192,1573,1574],{"class":1304},"os.environ/AZURE_API_VERSION\n",[1192,1576,1578],{"class":1194,"line":1577},45,[1192,1579,1251],{"emptyLinePlaceholder":1017},[1192,1581,1583,1585,1587,1589],{"class":1194,"line":1582},46,[1192,1584,1511],{"class":1267},[1192,1586,1514],{"class":1263},[1192,1588,1277],{"class":1267},[1192,1590,1591],{"class":1304},"azure/gpt-5.3-codex\n",[1192,1593,1595,1597],{"class":1194,"line":1594},47,[1192,1596,1525],{"class":1263},[1192,1598,1268],{"class":1267},[1192,1600,1602,1604,1606],{"class":1194,"line":1601},48,[1192,1603,1533],{"class":1263},[1192,1605,1277],{"class":1267},[1192,1607,1591],{"class":1304},[1192,1609,1611,1613,1615],{"class":1194,"line":1610},49,[1192,1612,1547],{"class":1263},[1192,1614,1277],{"class":1267},[1192,1616,1552],{"class":1304},[1192,1618,1620,1622,1624],{"class":1194,"line":1619},50,[1192,1621,1558],{"class":1263},[1192,1623,1277],{"class":1267},[1192,1625,1563],{"class":1304},[1192,1627,1629,1631,1633],{"class":1194,"line":1628},51,[1192,1630,1569],{"class":1263},[1192,1632,1277],{"class":1267},[1192,1634,1635],{"class":1304},"preview\n",[1192,1637,1639],{"class":1194,"line":1638},52,[1192,1640,1251],{"emptyLinePlaceholder":1017},[1192,1642,1644,1646,1648,1650],{"class":1194,"line":1643},53,[1192,1645,1511],{"class":1267},[1192,1647,1514],{"class":1263},[1192,1649,1277],{"class":1267},[1192,1651,1652],{"class":1304},"azure/gpt-5.4-mini\n",[1192,1654,1656,1658],{"class":1194,"line":1655},54,[1192,1657,1525],{"class":1263},[1192,1659,1268],{"class":1267},[1192,1661,1663,1665,1667],{"class":1194,"line":1662},55,[1192,1664,1533],{"class":1263},[1192,1666,1277],{"class":1267},[1192,1668,1652],{"class":1304},[1192,1670,1672,1674,1676],{"class":1194,"line":1671},56,[1192,1673,1547],{"class":1263},[1192,1675,1277],{"class":1267},[1192,1677,1552],{"class":1304},[1192,1679,1681,1683,1685],{"class":1194,"line":1680},57,[1192,1682,1558],{"class":1263},[1192,1684,1277],{"class":1267},[1192,1686,1563],{"class":1304},[1192,1688,1690,1692,1694],{"class":1194,"line":1689},58,[1192,1691,1569],{"class":1263},[1192,1693,1277],{"class":1267},[1192,1695,1574],{"class":1304},[1192,1697,1699],{"class":1194,"line":1698},59,[1192,1700,1251],{"emptyLinePlaceholder":1017},[1192,1702,1704],{"class":1194,"line":1703},60,[1192,1705,1706],{"class":1197},"  # --- Azure OpenAI Image Generation ---\n",[1192,1708,1710,1712,1714,1716],{"class":1194,"line":1709},61,[1192,1711,1511],{"class":1267},[1192,1713,1514],{"class":1263},[1192,1715,1277],{"class":1267},[1192,1717,1718],{"class":1304},"azure/gpt-image-1.5\n",[1192,1720,1722,1724],{"class":1194,"line":1721},62,[1192,1723,1525],{"class":1263},[1192,1725,1268],{"class":1267},[1192,1727,1729,1731,1733,1736],{"class":1194,"line":1728},63,[1192,1730,1533],{"class":1263},[1192,1732,1277],{"class":1267},[1192,1734,1735],{"class":1304},"azure/gpt-image-1.5",[1192,1737,1738],{"class":1197},"       # Azure OpenAI deployment name\n",[1192,1740,1742,1744,1746],{"class":1194,"line":1741},64,[1192,1743,1547],{"class":1263},[1192,1745,1277],{"class":1267},[1192,1747,1552],{"class":1304},[1192,1749,1751,1753,1755],{"class":1194,"line":1750},65,[1192,1752,1558],{"class":1263},[1192,1754,1277],{"class":1267},[1192,1756,1563],{"class":1304},[1192,1758,1760,1762,1764],{"class":1194,"line":1759},66,[1192,1761,1569],{"class":1263},[1192,1763,1277],{"class":1267},[1192,1765,1574],{"class":1304},[1192,1767,1769,1772],{"class":1194,"line":1768},67,[1192,1770,1771],{"class":1263},"    model_info",[1192,1773,1268],{"class":1267},[1192,1775,1777,1780,1782],{"class":1194,"line":1776},68,[1192,1778,1779],{"class":1263},"      mode",[1192,1781,1277],{"class":1267},[1192,1783,1784],{"class":1304},"image_generation\n",[1192,1786,1788],{"class":1194,"line":1787},69,[1192,1789,1251],{"emptyLinePlaceholder":1017},[1192,1791,1793],{"class":1194,"line":1792},70,[1192,1794,1795],{"class":1197},"  # --- Azure AI Foundry Serverless — FLUX.2-pro ---\n",[1192,1797,1799,1801,1803,1805],{"class":1194,"line":1798},71,[1192,1800,1511],{"class":1267},[1192,1802,1514],{"class":1263},[1192,1804,1277],{"class":1267},[1192,1806,1807],{"class":1304},"azure/flux-2-pro\n",[1192,1809,1811,1813],{"class":1194,"line":1810},72,[1192,1812,1525],{"class":1263},[1192,1814,1268],{"class":1267},[1192,1816,1818,1820,1822],{"class":1194,"line":1817},73,[1192,1819,1533],{"class":1263},[1192,1821,1277],{"class":1267},[1192,1823,1824],{"class":1304},"azure_ai/FLUX.2-pro\n",[1192,1826,1828,1830,1832],{"class":1194,"line":1827},74,[1192,1829,1547],{"class":1263},[1192,1831,1277],{"class":1267},[1192,1833,1834],{"class":1304},"os.environ/AZURE_AI_FOUNDRY_BASE\n",[1192,1836,1838,1840,1842],{"class":1194,"line":1837},75,[1192,1839,1558],{"class":1263},[1192,1841,1277],{"class":1267},[1192,1843,1563],{"class":1304},[1192,1845,1847,1849],{"class":1194,"line":1846},76,[1192,1848,1771],{"class":1263},[1192,1850,1268],{"class":1267},[1192,1852,1854,1856,1858],{"class":1194,"line":1853},77,[1192,1855,1779],{"class":1263},[1192,1857,1277],{"class":1267},[1192,1859,1784],{"class":1304},[1192,1861,1863],{"class":1194,"line":1862},78,[1192,1864,1251],{"emptyLinePlaceholder":1017},[1192,1866,1868],{"class":1194,"line":1867},79,[1192,1869,1198],{"class":1197},[1192,1871,1873],{"class":1194,"line":1872},80,[1192,1874,1875],{"class":1197},"# ROUTER SETTINGS (load balancing, fallbacks)\n",[1192,1877,1879],{"class":1194,"line":1878},81,[1192,1880,1198],{"class":1197},[1192,1882,1884,1887],{"class":1194,"line":1883},82,[1192,1885,1886],{"class":1263},"router_settings",[1192,1888,1268],{"class":1267},[1192,1890,1892,1895,1897,1900],{"class":1194,"line":1891},83,[1192,1893,1894],{"class":1263},"  routing_strategy",[1192,1896,1277],{"class":1267},[1192,1898,1899],{"class":1304},"\"cost-based-routing\"",[1192,1901,1902],{"class":1197},"  # options: \"priority\", \"round_robin\"\n",[1192,1904,1906,1909,1911],{"class":1194,"line":1905},84,[1192,1907,1908],{"class":1263},"  num_retries",[1192,1910,1277],{"class":1267},[1192,1912,1913],{"class":1280},"2\n",[1192,1915,1917,1920,1922],{"class":1194,"line":1916},85,[1192,1918,1919],{"class":1263},"  timeout",[1192,1921,1277],{"class":1267},[1192,1923,1924],{"class":1280},"120\n",[612,1926,1927],{},"And the local config, with the added Ollama models:",[1181,1929,1930],{},[713,1931,1934],{"className":1185,"code":1932,"filename":1933,"language":1188,"meta":449,"style":449},"# =============================================================================\n# LiteLLM Proxy Configuration — Local Models Enabled\n# Docs: https://docs.litellm.ai/docs/proxy/configs\n# =============================================================================\n#\n# This config extends the cloud-only config.yaml with local Ollama models.\n# It is mounted automatically when you run:\n#   .\\scripts\\start.ps1 gpu-start ollama\n#\n# Pull models first:\n#   docker exec ai-ollama ollama pull gemma4\n#   docker exec ai-ollama ollama pull gemma4:e2b\n#   docker exec ai-ollama ollama pull llama3.2\n# =============================================================================\n\n# --- Observability: send all calls to Langfuse ---\nlitellm_settings:\n  drop_params: true\n  set_verbose: false\n  success_callback: [\"langfuse\"]\n  failure_callback: [\"langfuse\"]\n  cache: true\n  cache_params:\n    type: redis\n    host: redis\n    port: 6379\n    password: os.environ/REDIS_PASSWORD\n\n# --- Environment variable references ---\nenvironment_variables:\n  LANGFUSE_PUBLIC_KEY: os.environ/LANGFUSE_PUBLIC_KEY\n  LANGFUSE_SECRET_KEY: os.environ/LANGFUSE_SECRET_KEY\n  LANGFUSE_HOST: os.environ/LANGFUSE_HOST\n\n# --- General proxy settings ---\ngeneral_settings:\n  master_key: os.environ/LITELLM_MASTER_KEY\n  database_url: os.environ/LITELLM_DATABASE_URL\n  alerting: [\"log\"]\n\n# =============================================================================\n# MODEL DEFINITIONS\n# =============================================================================\n\nmodel_list:\n\n  - model_name: azure/gpt-5.4-nano-2\n    litellm_params:\n      model: azure/gpt-5.4-nano-2\n      api_base: os.environ/AZURE_API_BASE\n      api_key: os.environ/AZURE_API_KEY\n      api_version: os.environ/AZURE_API_VERSION\n\n  - model_name: azure/gpt-5.3-codex\n    litellm_params:\n      model: azure/gpt-5.3-codex\n      api_base: os.environ/AZURE_API_BASE\n      api_key: os.environ/AZURE_API_KEY\n      api_version: preview\n\n  - model_name: azure/gpt-5.4-mini\n    litellm_params:\n      model: azure/gpt-5.4-mini\n      api_base: os.environ/AZURE_API_BASE\n      api_key: os.environ/AZURE_API_KEY\n      api_version: os.environ/AZURE_API_VERSION\n\n  - model_name: azure/gpt-image-1.5\n    litellm_params:\n      model: azure/gpt-image-1.5\n      api_base: os.environ/AZURE_API_BASE\n      api_key: os.environ/AZURE_API_KEY\n      api_version: os.environ/AZURE_API_VERSION\n    model_info:\n      mode: image_generation\n\n  - model_name: azure/flux-2-pro\n    litellm_params:\n      model: azure_ai/FLUX.2-pro\n      api_base: os.environ/AZURE_AI_FOUNDRY_BASE\n      api_key: os.environ/AZURE_API_KEY\n    model_info:\n      mode: image_generation\n\n  - model_name: gemma4:e2b\n    litellm_params:\n      model: ollama/gemma4:e2b\n      api_base: http://ollama:11434\n\n  - model_name: qwen3.5:9b\n    litellm_params:\n      model: ollama/qwen3.5:9b\n      api_base: http://ollama:11434\n\n\n# =============================================================================\n# ROUTER SETTINGS (load balancing, fallbacks)\n# =============================================================================\nrouter_settings:\n  routing_strategy: \"cost-based-routing\"\n  num_retries: 2\n  timeout: 120\n","config/litellm/config.local.yaml",[677,1935,1936,1940,1945,1949,1953,1957,1962,1967,1972,1976,1981,1986,1991,1996,2000,2004,2008,2014,2022,2030,2040,2050,2058,2064,2072,2080,2088,2096,2100,2104,2110,2118,2126,2134,2138,2142,2148,2156,2164,2174,2178,2182,2187,2191,2195,2201,2205,2215,2221,2229,2237,2245,2253,2257,2267,2273,2281,2289,2297,2305,2309,2319,2325,2333,2341,2349,2357,2361,2371,2377,2385,2393,2401,2409,2415,2423,2427,2437,2443,2451,2459,2467,2473,2481,2485,2496,2503,2513,2523,2528,2540,2547,2557,2566,2571,2576,2581,2586,2591,2598,2608,2617],{"__ignoreMap":449},[1192,1937,1938],{"class":1194,"line":50},[1192,1939,1198],{"class":1197},[1192,1941,1942],{"class":1194,"line":56},[1192,1943,1944],{"class":1197},"# LiteLLM Proxy Configuration — Local Models Enabled\n",[1192,1946,1947],{"class":1194,"line":92},[1192,1948,1208],{"class":1197},[1192,1950,1951],{"class":1194,"line":103},[1192,1952,1198],{"class":1197},[1192,1954,1955],{"class":1194,"line":1215},[1192,1956,1218],{"class":1197},[1192,1958,1959],{"class":1194,"line":1221},[1192,1960,1961],{"class":1197},"# This config extends the cloud-only config.yaml with local Ollama models.\n",[1192,1963,1964],{"class":1194,"line":1227},[1192,1965,1966],{"class":1197},"# It is mounted automatically when you run:\n",[1192,1968,1969],{"class":1194,"line":1233},[1192,1970,1971],{"class":1197},"#   .\\scripts\\start.ps1 gpu-start ollama\n",[1192,1973,1974],{"class":1194,"line":1020},[1192,1975,1218],{"class":1197},[1192,1977,1978],{"class":1194,"line":1243},[1192,1979,1980],{"class":1197},"# Pull models first:\n",[1192,1982,1983],{"class":1194,"line":1248},[1192,1984,1985],{"class":1197},"#   docker exec ai-ollama ollama pull gemma4\n",[1192,1987,1988],{"class":1194,"line":1254},[1192,1989,1990],{"class":1197},"#   docker exec ai-ollama ollama pull gemma4:e2b\n",[1192,1992,1993],{"class":1194,"line":1260},[1192,1994,1995],{"class":1197},"#   docker exec ai-ollama ollama pull llama3.2\n",[1192,1997,1998],{"class":1194,"line":1271},[1192,1999,1198],{"class":1197},[1192,2001,2002],{"class":1194,"line":1284},[1192,2003,1251],{"emptyLinePlaceholder":1017},[1192,2005,2006],{"class":1194,"line":1295},[1192,2007,1257],{"class":1197},[1192,2009,2010,2012],{"class":1194,"line":1311},[1192,2011,1264],{"class":1263},[1192,2013,1268],{"class":1267},[1192,2015,2016,2018,2020],{"class":1194,"line":1323},[1192,2017,1274],{"class":1263},[1192,2019,1277],{"class":1267},[1192,2021,1281],{"class":1280},[1192,2023,2024,2026,2028],{"class":1194,"line":1333},[1192,2025,1287],{"class":1263},[1192,2027,1277],{"class":1267},[1192,2029,1292],{"class":1280},[1192,2031,2032,2034,2036,2038],{"class":1194,"line":1341},[1192,2033,1298],{"class":1263},[1192,2035,1301],{"class":1267},[1192,2037,1305],{"class":1304},[1192,2039,1308],{"class":1267},[1192,2041,2042,2044,2046,2048],{"class":1194,"line":1352},[1192,2043,1314],{"class":1263},[1192,2045,1301],{"class":1267},[1192,2047,1305],{"class":1304},[1192,2049,1308],{"class":1267},[1192,2051,2052,2054,2056],{"class":1194,"line":1362},[1192,2053,1326],{"class":1263},[1192,2055,1277],{"class":1267},[1192,2057,1281],{"class":1280},[1192,2059,2060,2062],{"class":1194,"line":1373},[1192,2061,1336],{"class":1263},[1192,2063,1268],{"class":1267},[1192,2065,2066,2068,2070],{"class":1194,"line":1384},[1192,2067,1344],{"class":1263},[1192,2069,1277],{"class":1267},[1192,2071,1349],{"class":1304},[1192,2073,2074,2076,2078],{"class":1194,"line":1389},[1192,2075,1355],{"class":1263},[1192,2077,1277],{"class":1267},[1192,2079,1349],{"class":1304},[1192,2081,2082,2084,2086],{"class":1194,"line":1395},[1192,2083,1365],{"class":1263},[1192,2085,1277],{"class":1267},[1192,2087,1370],{"class":1280},[1192,2089,2090,2092,2094],{"class":1194,"line":1403},[1192,2091,1376],{"class":1263},[1192,2093,1277],{"class":1267},[1192,2095,1381],{"class":1304},[1192,2097,2098],{"class":1194,"line":1414},[1192,2099,1251],{"emptyLinePlaceholder":1017},[1192,2101,2102],{"class":1194,"line":1425},[1192,2103,1392],{"class":1197},[1192,2105,2106,2108],{"class":1194,"line":1436},[1192,2107,1398],{"class":1263},[1192,2109,1268],{"class":1267},[1192,2111,2112,2114,2116],{"class":1194,"line":1441},[1192,2113,1406],{"class":1263},[1192,2115,1277],{"class":1267},[1192,2117,1411],{"class":1304},[1192,2119,2120,2122,2124],{"class":1194,"line":1447},[1192,2121,1417],{"class":1263},[1192,2123,1277],{"class":1267},[1192,2125,1422],{"class":1304},[1192,2127,2128,2130,2132],{"class":1194,"line":1455},[1192,2129,1428],{"class":1263},[1192,2131,1277],{"class":1267},[1192,2133,1433],{"class":1304},[1192,2135,2136],{"class":1194,"line":1466},[1192,2137,1251],{"emptyLinePlaceholder":1017},[1192,2139,2140],{"class":1194,"line":1477},[1192,2141,1444],{"class":1197},[1192,2143,2144,2146],{"class":1194,"line":1490},[1192,2145,1450],{"class":1263},[1192,2147,1268],{"class":1267},[1192,2149,2150,2152,2154],{"class":1194,"line":1495},[1192,2151,1458],{"class":1263},[1192,2153,1277],{"class":1267},[1192,2155,1463],{"class":1304},[1192,2157,2158,2160,2162],{"class":1194,"line":1503},[1192,2159,1469],{"class":1263},[1192,2161,1277],{"class":1267},[1192,2163,1474],{"class":1304},[1192,2165,2166,2168,2170,2172],{"class":1194,"line":1508},[1192,2167,1480],{"class":1263},[1192,2169,1301],{"class":1267},[1192,2171,1485],{"class":1304},[1192,2173,1308],{"class":1267},[1192,2175,2176],{"class":1194,"line":1522},[1192,2177,1251],{"emptyLinePlaceholder":1017},[1192,2179,2180],{"class":1194,"line":1530},[1192,2181,1198],{"class":1197},[1192,2183,2184],{"class":1194,"line":1544},[1192,2185,2186],{"class":1197},"# MODEL DEFINITIONS\n",[1192,2188,2189],{"class":1194,"line":1555},[1192,2190,1198],{"class":1197},[1192,2192,2193],{"class":1194,"line":1566},[1192,2194,1251],{"emptyLinePlaceholder":1017},[1192,2196,2197,2199],{"class":1194,"line":1577},[1192,2198,1498],{"class":1263},[1192,2200,1268],{"class":1267},[1192,2202,2203],{"class":1194,"line":1582},[1192,2204,1251],{"emptyLinePlaceholder":1017},[1192,2206,2207,2209,2211,2213],{"class":1194,"line":1594},[1192,2208,1511],{"class":1267},[1192,2210,1514],{"class":1263},[1192,2212,1277],{"class":1267},[1192,2214,1519],{"class":1304},[1192,2216,2217,2219],{"class":1194,"line":1601},[1192,2218,1525],{"class":1263},[1192,2220,1268],{"class":1267},[1192,2222,2223,2225,2227],{"class":1194,"line":1610},[1192,2224,1533],{"class":1263},[1192,2226,1277],{"class":1267},[1192,2228,1519],{"class":1304},[1192,2230,2231,2233,2235],{"class":1194,"line":1619},[1192,2232,1547],{"class":1263},[1192,2234,1277],{"class":1267},[1192,2236,1552],{"class":1304},[1192,2238,2239,2241,2243],{"class":1194,"line":1628},[1192,2240,1558],{"class":1263},[1192,2242,1277],{"class":1267},[1192,2244,1563],{"class":1304},[1192,2246,2247,2249,2251],{"class":1194,"line":1638},[1192,2248,1569],{"class":1263},[1192,2250,1277],{"class":1267},[1192,2252,1574],{"class":1304},[1192,2254,2255],{"class":1194,"line":1643},[1192,2256,1251],{"emptyLinePlaceholder":1017},[1192,2258,2259,2261,2263,2265],{"class":1194,"line":1655},[1192,2260,1511],{"class":1267},[1192,2262,1514],{"class":1263},[1192,2264,1277],{"class":1267},[1192,2266,1591],{"class":1304},[1192,2268,2269,2271],{"class":1194,"line":1662},[1192,2270,1525],{"class":1263},[1192,2272,1268],{"class":1267},[1192,2274,2275,2277,2279],{"class":1194,"line":1671},[1192,2276,1533],{"class":1263},[1192,2278,1277],{"class":1267},[1192,2280,1591],{"class":1304},[1192,2282,2283,2285,2287],{"class":1194,"line":1680},[1192,2284,1547],{"class":1263},[1192,2286,1277],{"class":1267},[1192,2288,1552],{"class":1304},[1192,2290,2291,2293,2295],{"class":1194,"line":1689},[1192,2292,1558],{"class":1263},[1192,2294,1277],{"class":1267},[1192,2296,1563],{"class":1304},[1192,2298,2299,2301,2303],{"class":1194,"line":1698},[1192,2300,1569],{"class":1263},[1192,2302,1277],{"class":1267},[1192,2304,1635],{"class":1304},[1192,2306,2307],{"class":1194,"line":1703},[1192,2308,1251],{"emptyLinePlaceholder":1017},[1192,2310,2311,2313,2315,2317],{"class":1194,"line":1709},[1192,2312,1511],{"class":1267},[1192,2314,1514],{"class":1263},[1192,2316,1277],{"class":1267},[1192,2318,1652],{"class":1304},[1192,2320,2321,2323],{"class":1194,"line":1721},[1192,2322,1525],{"class":1263},[1192,2324,1268],{"class":1267},[1192,2326,2327,2329,2331],{"class":1194,"line":1728},[1192,2328,1533],{"class":1263},[1192,2330,1277],{"class":1267},[1192,2332,1652],{"class":1304},[1192,2334,2335,2337,2339],{"class":1194,"line":1741},[1192,2336,1547],{"class":1263},[1192,2338,1277],{"class":1267},[1192,2340,1552],{"class":1304},[1192,2342,2343,2345,2347],{"class":1194,"line":1750},[1192,2344,1558],{"class":1263},[1192,2346,1277],{"class":1267},[1192,2348,1563],{"class":1304},[1192,2350,2351,2353,2355],{"class":1194,"line":1759},[1192,2352,1569],{"class":1263},[1192,2354,1277],{"class":1267},[1192,2356,1574],{"class":1304},[1192,2358,2359],{"class":1194,"line":1768},[1192,2360,1251],{"emptyLinePlaceholder":1017},[1192,2362,2363,2365,2367,2369],{"class":1194,"line":1776},[1192,2364,1511],{"class":1267},[1192,2366,1514],{"class":1263},[1192,2368,1277],{"class":1267},[1192,2370,1718],{"class":1304},[1192,2372,2373,2375],{"class":1194,"line":1787},[1192,2374,1525],{"class":1263},[1192,2376,1268],{"class":1267},[1192,2378,2379,2381,2383],{"class":1194,"line":1792},[1192,2380,1533],{"class":1263},[1192,2382,1277],{"class":1267},[1192,2384,1718],{"class":1304},[1192,2386,2387,2389,2391],{"class":1194,"line":1798},[1192,2388,1547],{"class":1263},[1192,2390,1277],{"class":1267},[1192,2392,1552],{"class":1304},[1192,2394,2395,2397,2399],{"class":1194,"line":1810},[1192,2396,1558],{"class":1263},[1192,2398,1277],{"class":1267},[1192,2400,1563],{"class":1304},[1192,2402,2403,2405,2407],{"class":1194,"line":1817},[1192,2404,1569],{"class":1263},[1192,2406,1277],{"class":1267},[1192,2408,1574],{"class":1304},[1192,2410,2411,2413],{"class":1194,"line":1827},[1192,2412,1771],{"class":1263},[1192,2414,1268],{"class":1267},[1192,2416,2417,2419,2421],{"class":1194,"line":1837},[1192,2418,1779],{"class":1263},[1192,2420,1277],{"class":1267},[1192,2422,1784],{"class":1304},[1192,2424,2425],{"class":1194,"line":1846},[1192,2426,1251],{"emptyLinePlaceholder":1017},[1192,2428,2429,2431,2433,2435],{"class":1194,"line":1853},[1192,2430,1511],{"class":1267},[1192,2432,1514],{"class":1263},[1192,2434,1277],{"class":1267},[1192,2436,1807],{"class":1304},[1192,2438,2439,2441],{"class":1194,"line":1862},[1192,2440,1525],{"class":1263},[1192,2442,1268],{"class":1267},[1192,2444,2445,2447,2449],{"class":1194,"line":1867},[1192,2446,1533],{"class":1263},[1192,2448,1277],{"class":1267},[1192,2450,1824],{"class":1304},[1192,2452,2453,2455,2457],{"class":1194,"line":1872},[1192,2454,1547],{"class":1263},[1192,2456,1277],{"class":1267},[1192,2458,1834],{"class":1304},[1192,2460,2461,2463,2465],{"class":1194,"line":1878},[1192,2462,1558],{"class":1263},[1192,2464,1277],{"class":1267},[1192,2466,1563],{"class":1304},[1192,2468,2469,2471],{"class":1194,"line":1883},[1192,2470,1771],{"class":1263},[1192,2472,1268],{"class":1267},[1192,2474,2475,2477,2479],{"class":1194,"line":1891},[1192,2476,1779],{"class":1263},[1192,2478,1277],{"class":1267},[1192,2480,1784],{"class":1304},[1192,2482,2483],{"class":1194,"line":1905},[1192,2484,1251],{"emptyLinePlaceholder":1017},[1192,2486,2487,2489,2491,2493],{"class":1194,"line":1916},[1192,2488,1511],{"class":1267},[1192,2490,1514],{"class":1263},[1192,2492,1277],{"class":1267},[1192,2494,2495],{"class":1304},"gemma4:e2b\n",[1192,2497,2499,2501],{"class":1194,"line":2498},86,[1192,2500,1525],{"class":1263},[1192,2502,1268],{"class":1267},[1192,2504,2506,2508,2510],{"class":1194,"line":2505},87,[1192,2507,1533],{"class":1263},[1192,2509,1277],{"class":1267},[1192,2511,2512],{"class":1304},"ollama/gemma4:e2b\n",[1192,2514,2516,2518,2520],{"class":1194,"line":2515},88,[1192,2517,1547],{"class":1263},[1192,2519,1277],{"class":1267},[1192,2521,2522],{"class":1304},"http://ollama:11434\n",[1192,2524,2526],{"class":1194,"line":2525},89,[1192,2527,1251],{"emptyLinePlaceholder":1017},[1192,2529,2531,2533,2535,2537],{"class":1194,"line":2530},90,[1192,2532,1511],{"class":1267},[1192,2534,1514],{"class":1263},[1192,2536,1277],{"class":1267},[1192,2538,2539],{"class":1304},"qwen3.5:9b\n",[1192,2541,2543,2545],{"class":1194,"line":2542},91,[1192,2544,1525],{"class":1263},[1192,2546,1268],{"class":1267},[1192,2548,2550,2552,2554],{"class":1194,"line":2549},92,[1192,2551,1533],{"class":1263},[1192,2553,1277],{"class":1267},[1192,2555,2556],{"class":1304},"ollama/qwen3.5:9b\n",[1192,2558,2560,2562,2564],{"class":1194,"line":2559},93,[1192,2561,1547],{"class":1263},[1192,2563,1277],{"class":1267},[1192,2565,2522],{"class":1304},[1192,2567,2569],{"class":1194,"line":2568},94,[1192,2570,1251],{"emptyLinePlaceholder":1017},[1192,2572,2574],{"class":1194,"line":2573},95,[1192,2575,1251],{"emptyLinePlaceholder":1017},[1192,2577,2579],{"class":1194,"line":2578},96,[1192,2580,1198],{"class":1197},[1192,2582,2584],{"class":1194,"line":2583},97,[1192,2585,1875],{"class":1197},[1192,2587,2589],{"class":1194,"line":2588},98,[1192,2590,1198],{"class":1197},[1192,2592,2594,2596],{"class":1194,"line":2593},99,[1192,2595,1886],{"class":1263},[1192,2597,1268],{"class":1267},[1192,2599,2601,2603,2605],{"class":1194,"line":2600},100,[1192,2602,1894],{"class":1263},[1192,2604,1277],{"class":1267},[1192,2606,2607],{"class":1304},"\"cost-based-routing\"\n",[1192,2609,2611,2613,2615],{"class":1194,"line":2610},101,[1192,2612,1908],{"class":1263},[1192,2614,1277],{"class":1267},[1192,2616,1913],{"class":1280},[1192,2618,2620,2622,2624],{"class":1194,"line":2619},102,[1192,2621,1919],{"class":1263},[1192,2623,1277],{"class":1267},[1192,2625,1924],{"class":1280},[1142,2627,526],{"id":2628},"searxng-private-web-search",[612,2630,2631,2636],{},[617,2632,2635],{"href":2633,"rel":2634},"https://github.com/searxng/searxng",[657],"SearXNG"," is a self-hosted meta-search engine that queries Bing, Google, DuckDuckGo and others simultaneously without exposing your identity to any of them.",[612,2638,2639],{},"In Open WebUI, clicking the globe icon on any message triggers a SearXNG search and injects results into the prompt context. The model gets current information; the search engines get an anonymous request. No API keys, no tracking, no per-search billing. This is how you give your LLM web access without giving away your data.",[612,2641,2642],{},"The setup can be finnicky, but I have found a good mix of speed and accuracy. It's in the repo so check it out.",[645,2644,531],{"id":2645},"gpu-tier-per-service-overlays",[612,2647,2648,2649,2652],{},"The original version of this stack had a single ",[677,2650,2651],{},"docker-compose.gpu.yml"," that started Ollama, Whisper, and ComfyUI together. That works fine if you have 16+ GB of VRAM, but on my 8GB RTX 4070 GPU it was not proving impossible to run multiple models at the same time.",[612,2654,2655,2656,2659,2660,2663,2664,2667],{},"The fix: each GPU service now has its own Docker Compose overlay file. The management scripts support ",[677,2657,2658],{},"gpu-start",", ",[677,2661,2662],{},"gpu-stop",", and ",[677,2665,2666],{},"gpu-switch"," commands that bring individual services up or down without touching the core stack.",[1070,2669,2670,2683],{},[1073,2671,2672],{},[1076,2673,2674,2677,2680],{},[1079,2675,2676],{},"Overlay",[1079,2678,2679],{},"Service",[1079,2681,2682],{},"VRAM usage",[1089,2684,2685,2698,2715,2728],{},[1076,2686,2687,2692,2695],{},[1094,2688,2689],{},[677,2690,2691],{},"docker-compose.ollama.yml",[1094,2693,2694],{},"Ollama (local LLMs)",[1094,2696,2697],{},"Depends on model (~7-10 GB for Gemma 4)",[1076,2699,2700,2705,2708],{},[1094,2701,2702],{},[677,2703,2704],{},"docker-compose.whisper.yml",[1094,2706,2707],{},"Whisper (speech-to-text)",[1094,2709,2710,2711,2714],{},"~1 GB on ",[677,2712,2713],{},"base"," model",[1076,2716,2717,2722,2725],{},[1094,2718,2719],{},[677,2720,2721],{},"docker-compose.comfyui.yml",[1094,2723,2724],{},"ComfyUI (image generation)",[1094,2726,2727],{},"~6.5 GB for SDXL",[1076,2729,2730,2735,2738],{},[1094,2731,2732],{},[677,2733,2734],{},"docker-compose.litellm-local.yml",[1094,2736,2737],{},"LiteLLM config swap",[1094,2739,2740],{},"—",[612,2742,2743],{},"The practical workflow on an 8 GB GPU:",[713,2745,2749],{"className":2746,"code":2747,"language":2748,"meta":449,"style":449},"language-powershell shiki shiki-themes github-light github-dark github-dark","# Day-to-day: core stack + Ollama for private local chat\n.\\scripts\\start.ps1 up core\n.\\scripts\\start.ps1 gpu-start ollama\n","powershell",[677,2750,2751,2756,2761],{"__ignoreMap":449},[1192,2752,2753],{"class":1194,"line":50},[1192,2754,2755],{"class":1197},"# Day-to-day: core stack + Ollama for private local chat\n",[1192,2757,2758],{"class":1194,"line":56},[1192,2759,2760],{"class":1267},".\\scripts\\start.ps1 up core\n",[1192,2762,2763,2766,2770],{"class":1194,"line":92},[1192,2764,2765],{"class":1267},".\\scripts\\start.ps1 gpu",[1192,2767,2769],{"class":2768},"so5gQ","-",[1192,2771,2772],{"class":1267},"start ollama\n",[612,2774,2775],{},"Need to generate images? Stop Ollama, start ComfyUI:",[713,2777,2779],{"className":2746,"code":2778,"language":2748,"meta":449,"style":449},".\\scripts\\start.ps1 gpu-switch comfyui\n",[677,2780,2781],{"__ignoreMap":449},[1192,2782,2783,2785,2787],{"class":1194,"line":50},[1192,2784,2765],{"class":1267},[1192,2786,2769],{"class":2768},[1192,2788,2789],{"class":1267},"switch comfyui\n",[612,2791,2792],{},"Done with images, back to local LLMs:",[713,2794,2796],{"className":2746,"code":2795,"language":2748,"meta":449,"style":449},".\\scripts\\start.ps1 gpu-switch ollama\n",[677,2797,2798],{"__ignoreMap":449},[1192,2799,2800,2802,2804],{"class":1194,"line":50},[1192,2801,2765],{"class":1267},[1192,2803,2769],{"class":2768},[1192,2805,2806],{"class":1267},"switch ollama\n",[612,2808,2809],{},"Whisper is lightweight enough to run alongside Ollama:",[713,2811,2813],{"className":2746,"code":2812,"language":2748,"meta":449,"style":449},".\\scripts\\start.ps1 gpu-start whisper\n",[677,2814,2815],{"__ignoreMap":449},[1192,2816,2817,2819,2821],{"class":1194,"line":50},[1192,2818,2765],{"class":1267},[1192,2820,2769],{"class":2768},[1192,2822,2823],{"class":1267},"start whisper\n",[612,2825,2826,2827,2829],{},"The ",[677,2828,2666],{}," command handles the handoff: it stops the conflicting service first, frees the VRAM, then starts the new one. Note: it may take a few minutes for models to unload and load back in so don't swap too much.",[1142,2831,536],{"id":2832},"ollama-local-llms-zero-data-sharing",[612,2834,2835,2840],{},[617,2836,2839],{"href":2837,"rel":2838},"https://github.com/ollama/ollama",[657],"Ollama"," is the reason the GPU tier exists. It runs open-weight models locally — no API keys, no usage tracking, no data leaving your machine. Every prompt and every response stays on your hardware.",[612,2842,2843],{},"This matters more than it sounds. When I'm working on something sensitive like a client proposal, internal architecture notes, code with proprietary logic, I switch to a local model instead of sending it to OpenAI or Azure as per my company's policy.",[612,2845,2846,2847,2850],{},"The model I've been running since last week is ",[767,2848,2849],{},"Gemma 4"," from Google DeepMind, and honestly, it holds its own against ChatGPT for the tasks I throw at it. Another good contender is Qwen 3.5",[1142,2852,541],{"id":2853},"gemma-4-frontier-intelligence-on-a-single-gpu",[612,2855,2856,2860],{},[617,2857,2849],{"href":2858,"rel":2859},"https://ollama.com/library/gemma4",[657]," is Google DeepMind's latest open model family and it punches well above its weight class. The E4B variant (4.5 billion effective parameters) fits on a single consumer GPU and delivers reasoning, coding, and multimodal understanding that genuinely competes with cloud models I'm paying for.",[612,2862,2863],{},"What makes Gemma 4 stand out:",[685,2865,2866,2872,2878,2884,2890],{},[688,2867,2868,2871],{},[767,2869,2870],{},"Multimodal"," — processes both text and images, so I can paste screenshots into the chat",[688,2873,2874,2877],{},[767,2875,2876],{},"128K context window"," — long enough for full architecture documents or meeting transcripts",[688,2879,2880,2883],{},[767,2881,2882],{},"Configurable thinking mode"," — can show its reasoning chain or just give the answer",[688,2885,2886,2889],{},[767,2887,2888],{},"Native function calling"," — supports agentic workflows and tool use",[688,2891,2892,2895],{},[767,2893,2894],{},"Two edge sizes"," — E2B (2.3B effective, 7.2 GB) fits comfortably on 8 GB; E4B (4.5B effective, 9.6 GB) uses most of it",[612,2897,2898],{},"For drafting text, reviewing code, answering questions about documents Gemma 4 E4B gives me results comparable to what I get from GPT-5.4-mini through Azure. The reasoning benchmarks back this up: 69.4% on MMLU Pro, 42.5% on AIME 2026, and 52% on LiveCodeBench v6. These aren't frontier-model numbers, but for a model that runs entirely on my own GPU with zero latency to the cloud, it's remarkable.",[612,2900,2901],{},"I pull it with:",[713,2903,2907],{"className":2904,"code":2905,"language":2906,"meta":449,"style":449},"language-bash shiki shiki-themes github-light github-dark github-dark","docker exec ai-ollama ollama pull gemma4          # E4B — default, needs most of 8 GB\ndocker exec ai-ollama ollama pull gemma4:e2b      # E2B — lighter, comfortable on 8 GB\n","bash",[677,2908,2909,2933],{"__ignoreMap":449},[1192,2910,2911,2915,2918,2921,2924,2927,2930],{"class":1194,"line":50},[1192,2912,2914],{"class":2913},"shcOC","docker",[1192,2916,2917],{"class":1304}," exec",[1192,2919,2920],{"class":1304}," ai-ollama",[1192,2922,2923],{"class":1304}," ollama",[1192,2925,2926],{"class":1304}," pull",[1192,2928,2929],{"class":1304}," gemma4",[1192,2931,2932],{"class":1197},"          # E4B — default, needs most of 8 GB\n",[1192,2934,2935,2937,2939,2941,2943,2945,2948],{"class":1194,"line":56},[1192,2936,2914],{"class":2913},[1192,2938,2917],{"class":1304},[1192,2940,2920],{"class":1304},[1192,2942,2923],{"class":1304},[1192,2944,2926],{"class":1304},[1192,2946,2947],{"class":1304}," gemma4:e2b",[1192,2949,2950],{"class":1197},"      # E2B — lighter, comfortable on 8 GB\n",[612,2952,2953,2954,680],{},"Both appear in Open WebUI automatically once Ollama is started via ",[677,2955,2956],{},"gpu-start ollama",[612,2958,2959],{},[621,2960],{"alt":2961,"src":2962},"Gemma4:e2b","/images/blog/ai-stack/gemma4.png",[1142,2964,546],{"id":2965},"whisper-speech-to-text",[612,2967,2968,2969,2974],{},"The Whisper overlay adds a dedicated ",[617,2970,2973],{"href":2971,"rel":2972},"https://github.com/ahmetoner/whisper-asr-webservice",[657],"Whisper ASR service"," for processing microphone input from Open WebUI. GPU acceleration makes transcription near real-time even on the large-v3 model.",[612,2976,2977,2978,2981,2982,2985],{},"The overlay also reconfigures OpenWebUI automatically — when you run ",[677,2979,2980],{},"gpu-start whisper",", OpenWebUI is recreated with the STT environment variables pointing at the Whisper service. When you run ",[677,2983,2984],{},"gpu-stop whisper",", OpenWebUI goes back to its default (no dedicated STT).",[713,2987,2989],{"className":2904,"code":2988,"language":2906,"meta":449,"style":449},"# .env — choose your accuracy/speed tradeoff\nWHISPER_ASR_MODEL=large-v3   # best accuracy\n# WHISPER_ASR_MODEL=medium   # faster\n# WHISPER_ASR_MODEL=base     # fastest\n",[677,2990,2991,2996,3010,3015],{"__ignoreMap":449},[1192,2992,2993],{"class":1194,"line":50},[1192,2994,2995],{"class":1197},"# .env — choose your accuracy/speed tradeoff\n",[1192,2997,2998,3001,3004,3007],{"class":1194,"line":56},[1192,2999,3000],{"class":1267},"WHISPER_ASR_MODEL",[1192,3002,3003],{"class":2768},"=",[1192,3005,3006],{"class":1304},"large-v3",[1192,3008,3009],{"class":1197},"   # best accuracy\n",[1192,3011,3012],{"class":1194,"line":92},[1192,3013,3014],{"class":1197},"# WHISPER_ASR_MODEL=medium   # faster\n",[1192,3016,3017],{"class":1194,"line":103},[1192,3018,3019],{"class":1197},"# WHISPER_ASR_MODEL=base     # fastest\n",[612,3021,3022],{},"The core tier works without it because Open WebUI has a built-in CPU Whisper fallback but the dedicated service is noticeably faster. To be honest: I haven't tried it out yet. I'm not used to talk to my computer yet.",[1142,3024,551],{"id":3025},"comfyui-local-image-generation",[612,3027,3028,3033,3034,3037,3038,3041,3042,3045],{},[617,3029,3032],{"href":3030,"rel":3031},"https://github.com/comfyanonymous/ComfyUI",[657],"ComfyUI"," handles local Stable Diffusion inference. Drop any ",[677,3035,3036],{},".safetensors"," checkpoint into ",[677,3039,3040],{},"data/comfyui/models/"," and it's immediately available. Supports SDXL, SD 1.5, FLUX, and anything else you throw at it. The overlay starts ComfyUI with the ",[677,3043,3044],{},"--lowvram"," flag by default, which helps on 8 GB cards.",[612,3047,3048],{},"For cloud image generation, LiteLLM routes to Azure GPT Image 1.5 or Azure AI Foundry FLUX.2-pro. Pick your model in the Open WebUI settings.",[1142,3050,556],{"id":3051},"langfuse-observability",[612,3053,3054,3059],{},[617,3055,3058],{"href":3056,"rel":3057},"https://github.com/langfuse/langfuse",[657],"Langfuse"," receives a trace for every LLM call that passes through LiteLLM. The dashboard gives you input/output text, latency per model, token counts and cost per call, per-user breakdowns, and error rates with retry patterns.",[612,3061,3062],{},"This is invaluable when something behaves unexpectedly. You can replay the exact call, see the full prompt, and compare how different models respond. The stack includes ClickHouse as an analytics backend so trace queries stay fast even with thousands of entries.",[612,3064,3065],{},"Both Langfuse and ClickHouse are optional but once you see the possibilities it is good to keep them running. You get a much better understanding of the inner workings of the process of LLMs.",[1142,3067,561],{"id":3068},"open-notebook-document-research",[612,3070,3071,3076],{},[617,3072,3075],{"href":3073,"rel":3074},"https://github.com/lfnovo/open-notebook",[657],"Open Notebook"," is a self-hosted alternative to Google NotebookLM. Upload PDFs, web pages, or text files and have the LLM answer questions across them. It connects to LiteLLM, so it uses the same model pool as your chat.",[612,3078,3079],{},"This is where the stack really shines for work: meeting transcripts, architecture docs, long reports can be used and indexed without sending data to public providers. I'm researching the OpenWebUI RAG functionality though to see if this is still needed.",[645,3081,566],{"id":3082},"extras-tier",[1142,3084,570],{"id":3085},"caddy-https-reverse-proxy",[612,3087,3088,3093,3094,3097],{},[617,3089,3092],{"href":3090,"rel":3091},"https://caddyserver.com/",[657],"Caddy 2"," proxies every service behind a single domain and handles TLS automatically via Let's Encrypt. Going from ",[677,3095,3096],{},"localhost"," to a public domain is one variable:",[713,3099,3101],{"className":2904,"code":3100,"language":2906,"meta":449,"style":449},"# .env\nCADDY_DOMAIN=ai.example.com\n",[677,3102,3103,3108],{"__ignoreMap":449},[1192,3104,3105],{"class":1194,"line":50},[1192,3106,3107],{"class":1197},"# .env\n",[1192,3109,3110,3113,3115],{"class":1194,"line":56},[1192,3111,3112],{"class":1267},"CADDY_DOMAIN",[1192,3114,3003],{"class":2768},[1192,3116,3117],{"class":1304},"ai.example.com\n",[612,3119,3120],{},"Caddy reads this, configures HTTPS with a valid certificate, and handles renewals automatically.",[645,3122,575],{"id":3123},"the-databases",[612,3125,3126],{},"The stack uses four data stores, each picked for a specific reason:",[1070,3128,3129,3142],{},[1073,3130,3131],{},[1076,3132,3133,3136,3139],{},[1079,3134,3135],{},"Database",[1079,3137,3138],{},"Used by",[1079,3140,3141],{},"Purpose",[1089,3143,3144,3157,3169,3182],{},[1076,3145,3146,3151,3154],{},[1094,3147,3148],{},[767,3149,3150],{},"PostgreSQL 16",[1094,3152,3153],{},"LiteLLM, Langfuse",[1094,3155,3156],{},"Primary data store",[1076,3158,3159,3164,3166],{},[1094,3160,3161],{},[767,3162,3163],{},"Redis 7",[1094,3165,1165],{},[1094,3167,3168],{},"Response caching, rate limiting",[1076,3170,3171,3177,3179],{},[1094,3172,3173,3174],{},"C",[767,3175,3176],{},"lickHouse 24",[1094,3178,3058],{},[1094,3180,3181],{},"High-volume analytics traces",[1076,3183,3184,3189,3191],{},[1094,3185,3186],{},[767,3187,3188],{},"SeaweedFS",[1094,3190,3058],{},[1094,3192,3193],{},"S3-compatible object storage for media and events",[612,3195,3196,3197,3200],{},"All data lands in ",[677,3198,3199],{},"./data/"," on the host. Everything survives container restarts and updates.",[645,3202,79],{"id":660},[1142,3204,583],{"id":3205},"prerequisites",[685,3207,3208,3211,3220,3223],{},[688,3209,3210],{},"Docker ≥ 24.0 and Docker Compose ≥ 2.20",[688,3212,3213,3214,3219],{},"NVIDIA GPU + ",[617,3215,3218],{"href":3216,"rel":3217},"https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html",[657],"NVIDIA Container Toolkit"," (GPU tier only)",[688,3221,3222],{},"16 GB RAM minimum for core; 32 GB recommended with GPU tier",[688,3224,3225],{},"GPU with at least 8GB of VRAM",[3227,3228,3230,3235,3270,3299,3303,3398,3401,3407,3411,3477,3481,3484,3488,3495,3506,3547,3554,3571,3574,3578,3581,3585,3588,3644,3651,3655,3658,3689,3692,3701,3705,3708,3711,3738],"steps",{"level":3229},"3",[3231,3232,3234],"h4",{"id":3233},"_1-clone-and-configure","1. Clone and configure",[713,3236,3238],{"className":2904,"code":3237,"language":2906,"meta":449,"style":449},"git clone https://github.com/jdgoeij/CustomAIChat.git\ncd CustomAIChat\ncp .env.example .env\n",[677,3239,3240,3251,3259],{"__ignoreMap":449},[1192,3241,3242,3245,3248],{"class":1194,"line":50},[1192,3243,3244],{"class":2913},"git",[1192,3246,3247],{"class":1304}," clone",[1192,3249,3250],{"class":1304}," https://github.com/jdgoeij/CustomAIChat.git\n",[1192,3252,3253,3256],{"class":1194,"line":56},[1192,3254,3255],{"class":1280},"cd",[1192,3257,3258],{"class":1304}," CustomAIChat\n",[1192,3260,3261,3264,3267],{"class":1194,"line":92},[1192,3262,3263],{"class":2913},"cp",[1192,3265,3266],{"class":1304}," .env.example",[1192,3268,3269],{"class":1304}," .env\n",[612,3271,3272,3273,3276,3277,3280,3281,3284,3285,3288,3289,2659,3292,2659,3295,3298],{},"Open ",[677,3274,3275],{},".env"," and fill in your secrets. Every variable has an inline comment. At minimum you need ",[677,3278,3279],{},"POSTGRES_PASSWORD"," and ",[677,3282,3283],{},"REDIS_PASSWORD"," (strong random strings), a ",[677,3286,3287],{},"LITELLM_MASTER_KEY"," (the API key all clients use), Langfuse auth secrets (",[677,3290,3291],{},"LANGFUSE_SECRET_KEY",[677,3293,3294],{},"LANGFUSE_PUBLIC_KEY",[677,3296,3297],{},"LANGFUSE_SALT","), and your Azure OpenAI or OpenAI credentials if you want cloud models from day one.",[3231,3300,3302],{"id":3301},"_2-start-the-stack","2. Start the stack",[3304,3305,3306,3322,3382],"tabs",{},[3307,3308,3311,3314],"tabs-item",{"icon":3309,"label":3310},"i-lucide-server","Core",[612,3312,3313],{},"Core only (no GPU required)",[713,3315,3316],{"className":2746,"code":2760,"language":2748,"meta":449,"style":449},[677,3317,3318],{"__ignoreMap":449},[1192,3319,3320],{"class":1194,"line":50},[1192,3321,2760],{"class":1267},[3307,3323,3326,3329,3362,3365],{"icon":3324,"label":3325},"i-lucide-cpu","GPU (selective)",[612,3327,3328],{},"Start core, then add individual GPU services:",[713,3330,3332],{"className":2746,"code":3331,"language":2748,"meta":449,"style":449},".\\scripts\\start.ps1 up core\n.\\scripts\\start.ps1 gpu-start ollama    # local LLMs\n.\\scripts\\start.ps1 gpu-start whisper   # speech-to-text\n",[677,3333,3334,3338,3350],{"__ignoreMap":449},[1192,3335,3336],{"class":1194,"line":50},[1192,3337,2760],{"class":1267},[1192,3339,3340,3342,3344,3347],{"class":1194,"line":56},[1192,3341,2765],{"class":1267},[1192,3343,2769],{"class":2768},[1192,3345,3346],{"class":1267},"start ollama    ",[1192,3348,3349],{"class":1197},"# local LLMs\n",[1192,3351,3352,3354,3356,3359],{"class":1194,"line":92},[1192,3353,2765],{"class":1267},[1192,3355,2769],{"class":2768},[1192,3357,3358],{"class":1267},"start whisper   ",[1192,3360,3361],{"class":1197},"# speech-to-text\n",[612,3363,3364],{},"Switch between heavy services on 8 GB GPUs:",[713,3366,3368],{"className":2746,"code":3367,"language":2748,"meta":449,"style":449},".\\scripts\\start.ps1 gpu-switch comfyui  # stops Ollama, starts ComfyUI\n",[677,3369,3370],{"__ignoreMap":449},[1192,3371,3372,3374,3376,3379],{"class":1194,"line":50},[1192,3373,2765],{"class":1267},[1192,3375,2769],{"class":2768},[1192,3377,3378],{"class":1267},"switch comfyui  ",[1192,3380,3381],{"class":1197},"# stops Ollama, starts ComfyUI\n",[3307,3383,3386,3389],{"icon":3384,"label":3385},"i-lucide-layers","All",[612,3387,3388],{},"Everything including Caddy (needs ≥16 GB VRAM)",[713,3390,3392],{"className":2746,"code":3391,"language":2748,"meta":449,"style":449},".\\scripts\\start.ps1 up all\n",[677,3393,3394],{"__ignoreMap":449},[1192,3395,3396],{"class":1194,"line":50},[1192,3397,3391],{"class":1267},[612,3399,3400],{},"Once all containers are healthy, you'll see:",[713,3402,3405],{"className":3403,"code":3404,"language":718,"meta":449},[716],"✅ Open WebUI       → http://localhost:3000\n✅ Langfuse         → http://localhost:3001\n✅ Open Notebook    → http://localhost:3002\n✅ LiteLLM API      → http://localhost:4000\n✅ SearXNG          → http://localhost:8080\n",[677,3406,3404],{"__ignoreMap":449},[3231,3408,3410],{"id":3409},"_3-first-run-checklist","3. First-run checklist",[685,3412,3415,3428,3438,3450,3459,3468],{"className":3413},[3414],"contains-task-list",[688,3416,3419,3423,3424,3427],{"className":3417},[3418],"task-list-item",[3420,3421],"input",{"disabled":1017,"type":3422},"checkbox"," Open WebUI at ",[677,3425,3426],{},":3000"," — register your admin account (first registration wins)",[688,3429,3431,3433,3434,3437],{"className":3430},[3418],[3420,3432],{"disabled":1017,"type":3422}," Langfuse at ",[677,3435,3436],{},":3001"," — create your organisation and generate an API key pair",[688,3439,3441,3443,3444,3446,3447],{"className":3440},[3418],[3420,3442],{"disabled":1017,"type":3422}," Paste Langfuse keys back into ",[677,3445,3275],{}," and restart: ",[677,3448,3449],{},".\\scripts\\start.ps1 up core",[688,3451,3453,3455,3456],{"className":3452},[3418],[3420,3454],{"disabled":1017,"type":3422}," Pull a local model: ",[677,3457,3458],{},"docker exec ai-ollama ollama pull gemma4",[688,3460,3462,3464,3465],{"className":3461},[3418],[3420,3463],{"disabled":1017,"type":3422}," Or the lighter variant for 8 GB GPUs: ",[677,3466,3467],{},"docker exec ai-ollama ollama pull gemma4:e2b",[688,3469,3471,3473,3474],{"className":3470},[3418],[3420,3472],{"disabled":1017,"type":3422}," Drop Stable Diffusion checkpoints into ",[677,3475,3476],{},"data/comfyui/models/checkpoints/",[3231,3478,3480],{"id":3479},"_4-configuring-open-webui","4. Configuring Open WebUI",[612,3482,3483],{},"Once everything is running, Open WebUI needs to know about SearXNG and your image generation backend. Neither works out of the box — but both are quick to set up.",[3231,3485,3487],{"id":3486},"_5-web-search-with-searxng","5. Web search with SearXNG",[612,3489,3490,3491,3494],{},"Open WebUI talks to SearXNG over HTTP and expects JSON responses. The Docker Compose stack already handles networking between the containers, but SearXNG ships with JSON output disabled by default. Without it, Open WebUI gets HTML back and throws a ",[677,3492,3493],{},"403 Forbidden"," error.",[612,3496,3497,3498,3501,3502,3505],{},"First, make sure SearXNG has started at least once so it generates its config files. Then edit ",[677,3499,3500],{},"data/searxng/settings.yml"," and add ",[677,3503,3504],{},"json"," to the formats list:",[713,3507,3509],{"className":1185,"code":3508,"language":1188,"meta":449,"style":449},"# data/searxng/settings.yml\nsearch:\n  formats:\n    - html\n    - json    # required for Open WebUI\n",[677,3510,3511,3516,3523,3530,3538],{"__ignoreMap":449},[1192,3512,3513],{"class":1194,"line":50},[1192,3514,3515],{"class":1197},"# data/searxng/settings.yml\n",[1192,3517,3518,3521],{"class":1194,"line":56},[1192,3519,3520],{"class":1263},"search",[1192,3522,1268],{"class":1267},[1192,3524,3525,3528],{"class":1194,"line":92},[1192,3526,3527],{"class":1263},"  formats",[1192,3529,1268],{"class":1267},[1192,3531,3532,3535],{"class":1194,"line":103},[1192,3533,3534],{"class":1267},"    - ",[1192,3536,3537],{"class":1304},"html\n",[1192,3539,3540,3542,3544],{"class":1194,"line":1215},[1192,3541,3534],{"class":1267},[1192,3543,3504],{"class":1304},[1192,3545,3546],{"class":1197},"    # required for Open WebUI\n",[612,3548,3549,3550,3553],{},"Restart SearXNG after this change. Then in Open WebUI, go to ",[767,3551,3552],{},"Admin Panel → Settings → Web Search"," and configure:",[685,3555,3556,3563],{},[688,3557,3558,1277,3561],{},[767,3559,3560],{},"Web Search Engine",[677,3562,2635],{},[688,3564,3565,1277,3568],{},[767,3566,3567],{},"SearXNG Query URL",[677,3569,3570],{},"http://searxng:8080/search?q=\u003Cquery>",[612,3572,3573],{},"That's it. The globe icon in chat now triggers a private web search. Toggle it per message — it's not on by default.",[3231,3575,3577],{"id":3576},"_6-image-generation","6. Image generation",[612,3579,3580],{},"Open WebUI supports multiple image generation backends. Which one you configure depends on whether you're running the GPU tier (ComfyUI for local generation) or using cloud models through LiteLLM.",[3231,3582,3584],{"id":3583},"_7-option-a-cloud-image-generation-via-litellm","7. Option A: Cloud image generation via LiteLLM",[612,3586,3587],{},"If you have Azure GPT Image 1.5 or another OpenAI-compatible image model configured in LiteLLM, point Open WebUI to LiteLLM's API:",[3589,3590,3591,3597,3604,3614,3622,3630,3636],"ol",{},[688,3592,3593,3594],{},"Go to ",[767,3595,3596],{},"Admin Panel → Settings → Images",[688,3598,3599,3600,3603],{},"Toggle ",[767,3601,3602],{},"Image Generation"," on",[688,3605,3606,3607,3610,3611],{},"Set ",[767,3608,3609],{},"Image Generation Engine"," to ",[677,3612,3613],{},"OpenAI",[688,3615,3606,3616,3610,3619],{},[767,3617,3618],{},"API Base URL",[677,3620,3621],{},"http://litellm:4000/v1",[688,3623,3606,3624,3627,3628],{},[767,3625,3626],{},"API Key"," to your ",[677,3629,3287],{},[688,3631,3632,3633,3635],{},"Enter the model name exactly as it appears in your LiteLLM config (e.g. ",[677,3634,1735],{},")",[688,3637,3606,3638,3610,3641],{},[767,3639,3640],{},"Image Size",[677,3642,3643],{},"1024x1024",[612,3645,3646,3647,3650],{},"For Azure specifically, make sure your LiteLLM config uses API version ",[677,3648,3649],{},"2025-04-01-preview"," or later because older versions don't support the required parameters.",[3231,3652,3654],{"id":3653},"_7-option-b-local-generation-with-comfyui","7. Option B: Local generation with ComfyUI",[612,3656,3657],{},"If you're running the GPU tier with ComfyUI:",[3589,3659,3660,3664,3668,3674,3682],{},[688,3661,3593,3662],{},[767,3663,3596],{},[688,3665,3599,3666,3603],{},[767,3667,3602],{},[688,3669,3606,3670,3610,3672],{},[767,3671,3609],{},[677,3673,3032],{},[688,3675,3606,3676,3610,3679],{},[767,3677,3678],{},"ComfyUI Base URL",[677,3680,3681],{},"http://comfyui:8188",[688,3683,3684,3685,3688],{},"Import your workflow JSON (exported from ComfyUI in ",[767,3686,3687],{},"API Format"," — not the standard save)",[612,3690,3691],{},"The API Format export is important: in ComfyUI, enable \"Dev mode Options\" in settings first, then use \"Save (API Format)\" from the menu. The standard JSON export won't work.",[612,3693,3694,3695,3697,3698,3700],{},"Drop your ",[677,3696,3036],{}," checkpoints into ",[677,3699,3476],{}," and they appear immediately. No restart needed.",[3231,3702,3704],{"id":3703},"_8-keeping-image-models-out-of-the-chat-selector","8. Keeping image models out of the chat selector",[612,3706,3707],{},"Once you've configured the image backend, you'll notice the image models show up in the main model selector alongside your chat models. That's not ideal and you don't want to accidentally start a conversation with an image-only model.",[612,3709,3710],{},"The trick is to hide the image models from the selector but still make them available for in-chat image generation. Here's how:",[3589,3712,3713,3721,3728],{},[688,3714,3593,3715,3718,3719,3635],{},[767,3716,3717],{},"Workspace → Models"," and find your image model (e.g. ",[677,3720,1735],{},[688,3722,3723,3724],{},"Disable or hide the model so it no longer appears in the model dropdown:\n",[621,3725],{"alt":3726,"src":3727},"disable-model","/images/blog/ai-stack/model-configuration-1.png",[688,3729,3730,3731,3733,3734],{},"Then edit each chat model you want to use for image generation — open its settings and enable the ",[767,3732,3602],{}," capability\n",[621,3735],{"alt":3736,"src":3737},"configure-model","/images/blog/ai-stack/model-configuration-2.png",[612,3739,3740],{},"Now when you select a chat model like GPT-5.4-mini, an image generation button appears in the chat input. You stay in your conversation, click the button, type a prompt, and the image is generated using the backend you configured without ever leaving the chat or switching models. Text and images stay in one flow, just like you are used to in ChatGPT.",[645,3742,588],{"id":3743},"image-examples",[713,3745,3748],{"className":3746,"code":3747,"language":718,"meta":449},[716],"Create an image: An ominous robot overlord in a futuristic control room, surrounded by glowing monitors, holographic interfaces, and banks of surveillance cameras, watching over a vast city through large windows. The scene is cinematic and dramatic, with a cold blue and red color palette, subtle fog, towering machinery, and a sense of technological surveillance and AI dominance. The robot is large, sleek, and intimidating, but clearly fictional and non-human. Highly detailed, realistic sci-fi concept art, moody lighting, wide composition.\n",[677,3749,3747],{"__ignoreMap":449},[612,3751,3752,3753],{},"Result:\n",[621,3754],{"alt":3755,"src":3756},"ai-overlord","/images/blog/ai-stack/ai-overlord.png",[612,3758,3759],{},"Or something else:",[713,3761,3764],{"className":3762,"code":3763,"language":718,"meta":449},[716],"Create an image: A vibrant technical welcome scene for OpenWebUI running in a personal Docker Compose stack, available to everyone. Futuristic neon color palette with glowing cyan, magenta, purple, and electric blue accents. Show a sleek containerized infrastructure: Docker Compose YAML panels, modular service blocks, network lines, server racks, and an AI chat interface labeled OpenWebUI at the center. The mood is happy, welcoming, modern, and community-friendly. Clean high-tech UI elements, holographic displays, subtle circuit patterns, depth, and soft neon bloom. Highly detailed, cinematic lighting, professional tech illustration, sharp lines, glossy surfaces, and a premium cyberpunk-but-accessible aesthetic.\n",[677,3765,3763],{"__ignoreMap":449},[612,3767,3768],{},"Result 2:",[612,3770,3771],{},[621,3772],{"alt":3773,"src":3774},"openwebui","/images/blog/ai-stack/openwebui.png",[645,3776,593],{"id":3777},"common-pitfalls",[612,3779,3780,3783,3784,3787],{},[767,3781,3782],{},"No models in Open WebUI?"," LiteLLM probably hasn't connected yet. Check ",[677,3785,3786],{},"docker logs litellm"," — a single bad API key will silently skip that model on startup.",[612,3789,3790,3793,3794,3797,3798,3801],{},[767,3791,3792],{},"SearXNG returning 403?"," The ",[677,3795,3796],{},"SEARXNG_SECRET_KEY"," must be set before first boot. If you changed it after, delete ",[677,3799,3800],{},"data/searxng/"," and restart.",[612,3803,3804,3807,3808,2659,3811,2663,3813,3815],{},[767,3805,3806],{},"Langfuse not receiving traces?"," LiteLLM needs ",[677,3809,3810],{},"LANGFUSE_HOST",[677,3812,3294],{},[677,3814,3291],{},". Restart LiteLLM after setting them and verify under Traces in the dashboard.",[612,3817,3818,3821,3822,3825],{},[767,3819,3820],{},"GPU services ignoring the GPU?"," Confirm the NVIDIA Container Toolkit works: ",[677,3823,3824],{},"docker run --gpus all nvidia/cuda:12.0-base nvidia-smi",". If that fails, the toolkit isn't installed correctly.",[612,3827,3828,3831,3832,3834,3835,3837],{},[767,3829,3830],{},"VRAM out of memory?"," On 8 GB GPUs, don't run Ollama and ComfyUI at the same time. Use ",[677,3833,2666],{}," to swap between them. If Gemma 4 E4B is too tight, try the E2B variant (",[677,3836,3467],{},").",[612,3839,3840,3843,3844,3846],{},[767,3841,3842],{},"Caddy certificate failures?"," Your domain must be publicly reachable on ports 80 and 443 for Let's Encrypt. Use ",[677,3845,3096],{}," for local-only setups.",[645,3848,598],{"id":3849},"whats-next",[612,3851,3852,3853,3855,3856,3858,3859,3861],{},"The stack is intentionally modular — start with ",[677,3854,1098],{},", get comfortable with the UI and model routing, then add GPU services individually when the hardware is ready. On an 8 GB GPU, the ",[677,3857,2658],{}," / ",[677,3860,2666],{}," commands let you use every feature without running out of VRAM.",[612,3863,3864,3865,3870],{},"Most of the interesting customisation lives in the LiteLLM config. The ",[617,3866,3869],{"href":3867,"rel":3868},"https://docs.litellm.ai/docs/routing",[657],"routing docs"," cover fallback chains (Azure → Ollama on quota errors), per-model rate limits, and budget enforcement per user.",[612,3872,3873],{},"And whenever you want to know exactly what a model said, what it cost, and how long it took then Langfuse already has the answer.",[612,3875,3876],{},"If this post pushed you to try running a local model — start with Gemma 4. Pull it, ask it something, and see for yourself. The gap between local and cloud is shrinking fast.",[3878,3879,3880],"style",{},"html pre.shiki code .sCsY4, html code.shiki .sCsY4{--shiki-light:#6A737D;--shiki-default:#6A737D;--shiki-dark:#6A737D}html pre.shiki code .sByVh, html code.shiki .sByVh{--shiki-light:#22863A;--shiki-default:#85E89D;--shiki-dark:#85E89D}html pre.shiki code .slsVL, html code.shiki .slsVL{--shiki-light:#24292E;--shiki-default:#E1E4E8;--shiki-dark:#E1E4E8}html pre.shiki code .suiK_, html code.shiki .suiK_{--shiki-light:#005CC5;--shiki-default:#79B8FF;--shiki-dark:#79B8FF}html pre.shiki code .sfrk1, html code.shiki .sfrk1{--shiki-light:#032F62;--shiki-default:#9ECBFF;--shiki-dark:#9ECBFF}html .light .shiki span {color: var(--shiki-light);background: var(--shiki-light-bg);font-style: var(--shiki-light-font-style);font-weight: var(--shiki-light-font-weight);text-decoration: var(--shiki-light-text-decoration);}html.light .shiki span {color: var(--shiki-light);background: var(--shiki-light-bg);font-style: var(--shiki-light-font-style);font-weight: var(--shiki-light-font-weight);text-decoration: var(--shiki-light-text-decoration);}html .default .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}html.dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}html pre.shiki code .so5gQ, html code.shiki .so5gQ{--shiki-light:#D73A49;--shiki-default:#F97583;--shiki-dark:#F97583}html pre.shiki code .shcOC, html code.shiki .shcOC{--shiki-light:#6F42C1;--shiki-default:#B392F0;--shiki-dark:#B392F0}",{"title":449,"searchDepth":56,"depth":56,"links":3882},[3883,3884,3885,3890,3898,3901,3902,3905,3906,3907],{"id":647,"depth":56,"text":124},{"id":1065,"depth":56,"text":507},{"id":1140,"depth":56,"text":512,"children":3886},[3887,3888,3889],{"id":1144,"depth":92,"text":516},{"id":1158,"depth":92,"text":521},{"id":2628,"depth":92,"text":526},{"id":2645,"depth":56,"text":531,"children":3891},[3892,3893,3894,3895,3896,3897],{"id":2832,"depth":92,"text":536},{"id":2853,"depth":92,"text":541},{"id":2965,"depth":92,"text":546},{"id":3025,"depth":92,"text":551},{"id":3051,"depth":92,"text":556},{"id":3068,"depth":92,"text":561},{"id":3082,"depth":56,"text":566,"children":3899},[3900],{"id":3085,"depth":92,"text":570},{"id":3123,"depth":56,"text":575},{"id":660,"depth":56,"text":79,"children":3903},[3904],{"id":3205,"depth":92,"text":583},{"id":3743,"depth":56,"text":588},{"id":3777,"depth":56,"text":593},{"id":3849,"depth":56,"text":598},"2026-04-02",{},{"title":42,"description":500},[1029,1026,3912,3913,1165,3914,3915],"Docker","OpenWebUI","Privacy","DevOps","f0cGWbxm8j5fRHa8ZetwDaQX-luCYQVDTlKVHHG_ze8",{"id":3918,"title":26,"audience":1009,"body":3919,"canonical":1009,"cover":4073,"cta":1009,"date":4074,"description":4075,"extension":1014,"locale":1015,"meta":4076,"navigation":1017,"outcome":1009,"path":27,"problem":1009,"readingTime":1227,"seo":4077,"stem":28,"tags":4078,"translationOf":1009,"updatedAt":1009,"__hash__":4082,"_score":56},"blog/blog/from-hugo-to-nuxt-vibe-coding.md",{"type":605,"value":3920,"toc":4065},[3921,3924,3927,3930,3933,3936,3939,3942,3945,3948,3951,3954,3957,3960,3963,3966,3969,3972,3986,3989,3992,3995,3998,4001,4004,4032,4034,4037,4051,4054,4057],[612,3922,3923],{},"Last year I was running this blog on Hugo. It was fine. Hugo is fast, reliable, and battle-tested. I have nothing bad to say about it. But over time, I kept running into a wall. I wanted to be able to spice things up a bit with theming, animations/transitions and other features I didn't know I wanted (looking at you RSS feed). I found myself fighting the framework rather than building with it because of this.",[612,3925,3926],{},"Over the last year I tried the vibe code my new blog on various occasions. The last time was October 2025. Although it brought me further, I still was correcting a lot of output. But then Opus 4.6 (and now Sonnet 4.6) hit the market. What a difference, everything changed. I wanted to try it again. With this website as a result.",[645,3928,314],{"id":3929},"what-vibe-coding-actually-means-to-me",[612,3931,3932],{},"I want to be clear about what I mean by vibe coding, because it gets thrown around a lot. For me, it is not about blindly pasting AI output and hoping for the best. It is about having a fast, creative back-and-forth with a model where I describe what I want, in plain language or by pointing at code, and the model helps me realize it. I stay in control. I understand what lands in the codebase. But the friction between \"idea\" and \"working thing\" drops dramatically.",[612,3934,3935],{},"For that to work well, I wanted a framework that has a lot of online presence and the model knows deeply. Hugo is a niche static site generator. It has its own templating language, its own directory conventions, its own quirks. When I asked a model to help me extend something in Hugo, I spent a lot of time correcting misunderstandings. The model knew Go templates and Hugo's data pipeline at a surface level, at best.",[612,3937,3938],{},"Vue and Nuxt? The models know those inside out. Every pattern, every composable, every Tailwind class. The conversation just flows a lot better.",[645,3940,319],{"id":3941},"why-nuxt-specifically",[612,3943,3944],{},"I considered a few options. Next.js was an obvious candidate since React is everywhere and models are very strong with it. But I have always preferred Vue's approach to component design. The single-file component format, the reactivity model, the way templates stay readable. It suits how I think.",[612,3946,3947],{},"Nuxt builds on Vue and fills in everything you need for a real content site: file-based routing, server routes, auto-imports, a content layer built around Markdown. It is not a toy framework. Companies ship production applications with it. That maturity matters, because it means the patterns I learn and the things I build are not throwaway experiments. They are transferable.",[612,3949,3950],{},"The Nuxt Content module in particular was the deciding factor. My posts are Markdown files, and they always will be. Nuxt Content treats them as a first-class data source. I can query posts, filter by tag, sort by date, and render MDC components inside Markdown, all without reaching for a CMS or a third-party API.",[645,3952,324],{"id":3953},"the-migration",[612,3955,3956],{},"Migrating the actual content was straightforward. Hugo and Nuxt both expect Markdown with YAML frontmatter, so my posts moved over without changes beyond a few field name adjustments.",[612,3958,3959],{},"The real work was building the site itself: the layout, navigation, search, tag pages, and RSS feed. And this is exactly where vibe coding paid off.",[612,3961,3962],{},"In Hugo it would cost me a lot more time. In Nuxt, I described what I wanted, iterated in short loops with AI assistance, and had something I was proud of within a weekend. Not every suggestion landed perfectly. There were moments where I needed to read the Nuxt docs or dig into how a composable actually worked. But that is a healthy part of the process. I understand this codebase. I just built it faster than I ever could have on my own.",[612,3964,3965],{},"It's true what they say: understanding a language is easier than speaking it. You could say the same about programming languages. I understand variables, arrays, loops, if/else statements. But in every languague you have to get to know the syntax properly before you can start flying. With my Copilot, I found this part to be particularly fast-tracked.",[645,3967,329],{"id":3968},"what-changes-when-you-use-a-mature-framework",[612,3970,3971],{},"There is an underappreciated advantage to using a framework with a large ecosystem: the guard rails are already built. Nuxt handles code splitting, hydration, SEO meta, image optimization, and TypeScript out of the box or with a single module install. I do not have to invent solutions to problems that have already been solved a hundred times.",[612,3973,3974,3975,2659,3978,3981,3982,3985],{},"This matters even more when working with GenAI. When I ask for help with something in Nuxt, the model can suggest an idiomatic solution, one that fits the framework's conventions. In a niche tool, the model improvises. In Nuxt, it suggests ",[677,3976,3977],{},"useAsyncData",[677,3979,3980],{},"definePageMeta",", a ",[677,3983,3984],{},"server/routes/"," file. Things that actually exist and work the way they are supposed to.",[612,3987,3988],{},"The result is that my blog is now more capable than it ever was on Hugo. It has live search across all post content, tag filtering, a proper RSS feed, dark and light mode, and responsive design. The code is clean enough that I can keep extending it with confidence. Now the only thing that is missing is... Content.",[645,3990,334],{"id":3991},"exploring-genai-as-a-daily-tool",[612,3993,3994],{},"I want to be honest: I am a Cloud Architect by trade, not a frontend developer. JavaScript frameworks are not my primary home. What surprised me most about this project is how much I learned by doing it this way. When the model explained why a particular reactive pattern works in Vue, or suggested a server route instead of a client-side fetch, I paid attention. I looked things up. I built a working mental model.",[612,3996,3997],{},"GenAI is at its best when it accelerates genuine learning rather than bypassing it. If I had just accepted every code block without reading it, I would have a site I could not maintain. Instead I have a site I understand well enough to keep improving, and a framework I am now genuinely comfortable with.",[612,3999,4000],{},"That feels like the right way to use these tools.",[612,4002,4003],{},"My approach was simple:",[685,4005,4006,4009,4012,4015],{},[688,4007,4008],{},"Start anew with an empty Git repo.",[688,4010,4011],{},"Don't let AI build you scaffold: build it yourself following official documentation.",[688,4013,4014],{},"When I had the starter website working and running, I commit this code. This is now my baseline.",[688,4016,4017,4018],{},"From here on out I started iterating:\n",[685,4019,4020,4023,4026,4029],{},[688,4021,4022],{},"First I set the theme colors. Is it to my liking? Commit!",[688,4024,4025],{},"Then I started working on the various pages. Commit!",[688,4027,4028],{},"Menu bar. Commit!",[688,4030,4031],{},"etc.",[645,4033,339],{"id":3849},[612,4035,4036],{},"Now that the foundation is solid, I want to keep pushing on what a personal tech blog can be. A few things I am thinking about:",[685,4038,4039,4042,4045,4048],{},[688,4040,4041],{},"Reading progress indicator on long posts",[688,4043,4044],{},"Related posts suggestions based on tag overlap",[688,4046,4047],{},"Newsletter signup without a third-party service, handled by a Nuxt server route",[688,4049,4050],{},"Automated post metadata, meaning generating descriptions and reading time during build",[612,4052,4053],{},"All of these are things I would not have touched on Hugo (although provided out of the box) In Nuxt, with good tooling and GenAI on my side, they feel totally within control.",[612,4055,4056],{},"If you are sitting on a static site generator that is starting to feel limiting, I would encourage you to take a serious look at Nuxt. The migration effort is real but manageable, and what you get on the other side is a full-stack web framework backed by a huge ecosystem, paired with the most capable AI coding tools we have ever had. That is genuinely exciting.",[612,4058,4059,4060,680],{},"Happy to answer questions. Find me on ",[617,4061,4064],{"href":4062,"rel":4063},"https://www.linkedin.com/in/jaapdegoeij/",[657],"LinkedIn",{"title":449,"searchDepth":56,"depth":56,"links":4066},[4067,4068,4069,4070,4071,4072],{"id":3929,"depth":56,"text":314},{"id":3941,"depth":56,"text":319},{"id":3953,"depth":56,"text":324},{"id":3968,"depth":56,"text":329},{"id":3991,"depth":56,"text":334},{"id":3849,"depth":56,"text":339},"/images/blog/from-hugo-to-nuxt/cover.png","2026-03-03","How switching from Hugo to Nuxt opened the door to vibe coding with GenAI and why a mature framework makes all the difference when you want to build, explore, and experiment fast.",{},{"title":26,"description":4075},[4079,4080,1029,1027,4081],"Nuxt","Hugo","WebDev","UaqdfluRvxNVUiY1rsemQECBYRV5i1Rs-cxUK-tQonQ",{"id":4084,"title":14,"audience":4085,"body":4086,"canonical":1009,"cover":4135,"cta":4411,"date":4413,"description":121,"extension":1014,"locale":1015,"meta":4414,"navigation":1017,"outcome":4415,"path":15,"problem":4416,"readingTime":1233,"seo":4417,"stem":16,"tags":4418,"translationOf":1009,"updatedAt":1009,"__hash__":4423,"_score":50},"blog/blog/branch-manager-azure-devops.md","Cloud Engineers, DevOps practitioners, and anyone managing Azure DevOps at scale",{"type":605,"value":4087,"toc":4400},[4088,4090,4105,4108,4111,4114,4121,4124,4127,4130,4136,4139,4145,4148,4151,4182,4188,4191,4194,4205,4216,4219,4222,4225,4228,4231,4234,4293,4295,4298,4301,4337,4344,4356,4359,4366,4373,4376,4379,4386,4389,4392,4397],[608,4089,124],{"id":647},[612,4091,4092,4093,4096,4097,4100,4101,4104],{},"Every team I have worked with has the same problem at some point. You open Azure DevOps, navigate to a repository, and there are 200 branches listed. Half of them are from features that shipped two years ago. A handful are from developers who left the company. A few have names like ",[677,4094,4095],{},"test-fix-final-v3"," and nobody knows what they were for. Some commit message are cryptic as well. What would ",[677,4098,4099],{},"xyz"," or ",[677,4102,4103],{},"*"," mean?",[612,4106,4107],{},"The Azure DevOps portal is excellent for many things. Branch cleanup is not one of them. You can delete branches one at a time from the repository view. A tedious process if you need vigorous cleaning. There is no easy way to filter branches by age across all repos in a project, select a batch, and remove them in one go. If you are managing more than a handful of repositories, the manual process gets old quickly.",[612,4109,4110],{},"I kept meaning to write a script for it. I never quite did. Then I decided to build something slightly more permanent.",[645,4112,129],{"id":4113},"the-problem-with-branch-clutter",[612,4115,4116,4117,4120],{},"Stale branches are not just an aesthetic issue. They create real noise. When a developer runs ",[677,4118,4119],{},"git branch -r"," or opens the branch selector in a PR, they are scrolling past dozens of dead ends. It slows down onboarding, because new team members cannot tell which branches are active and which are relics. It complicates repository hygiene at scale, especially when you have tens of repositories in a project.",[612,4122,4123],{},"The other problem is safety. You do not want to bulk-delete branches without knowing what you are removing. Some branches have active pipelines. Some protect long-running release tracks. Any bulk cleanup tool needs to handle that clearly.",[645,4125,134],{"id":4126},"presenting",[612,4128,4129],{},"Branch Manager!",[612,4131,4132],{},[621,4133],{"alt":4134,"src":4135},"Branch Manager Light Mode","/images/blog/branch-manager-azure-devops/branch-manager-login-light-mode.png",[612,4137,4138],{},"And even in dark mode!",[612,4140,4141],{},[621,4142],{"alt":4143,"src":4144},"Branch Manager Dark Mode","/images/blog/branch-manager-azure-devops/branch-manager-login-dark-mode.png",[612,4146,4147],{},"Branch Manager is a self-hosted web application. You run it locally or host it on Azure App Service, point it at your Azure DevOps organization, sign in, and get a filterable table of every branch across all repositories in a project.",[612,4149,4150],{},"From there you can:",[685,4152,4153,4160,4163,4166,4169,4179],{},[688,4154,4155,4156,4159],{},"Filter by repository, branch name, and age so you can target ",[677,4157,4158],{},"feature/"," branches older than 90 days, for example",[688,4161,4162],{},"Sort by last commit date or author",[688,4164,4165],{},"See who last touched a branch and what the last commit message was",[688,4167,4168],{},"Protect branches automatically: any branch with an Azure DevOps policy attached to it is highlighted and locked from deletion, so you cannot accidentally remove a protected default branch",[688,4170,4171,4172,2659,4175,4178],{},"Add custom protection patterns, useful for protecting ",[677,4173,4174],{},"release/",[677,4176,4177],{},"hotfix/",", or any prefix your team uses",[688,4180,4181],{},"Select and delete in bulk, with a confirmation dialog that shows you exactly what is about to go",[612,4183,4184],{},[621,4185],{"alt":4186,"src":4187},"Branch Manager Logged In","/images/blog/branch-manager-azure-devops/branch-manager-logged-in.png",[645,4189,139],{"id":4190},"authentication-two-modes",[612,4192,4193],{},"Branch Manager supports two ways to authenticate against Azure DevOps.",[612,4195,4196,4197,4200,4201,4204],{},"The first is a ",[767,4198,4199],{},"Personal Access Token",". This is the quickest option if you are running it for yourself. No app registration needed. You enter your organization name and a PAT with ",[677,4202,4203],{},"Code.ReadWrite"," permissions, and you are in.",[612,4206,4207,4208,4211,4212,4215],{},"The second is ",[767,4209,4210],{},"Microsoft Entra ID",". This is the recommended option if you want to host Branch Manager for a team. You register a single-page application in Entra, grant it the ",[677,4213,4214],{},"user_impersonation"," permission on Azure DevOps, and your colleagues can sign in with their work account through the standard Microsoft login flow. This prevents the use of shared secrets and avoids using PATs altogether. Because everyone signs in with their own account you have an audit trail as well.",[612,4217,4218],{},"One important note: Entra ID authentication for Azure DevOps requires a work or school account. Personal Microsoft accounts do not work here. That is a Microsoft restriction, not something Branch Manager can change.",[645,4220,144],{"id":4221},"how-it-was-built",[612,4223,4224],{},"I must admin: this is a vibe coding project. The first version was a PowerShell script and although it worked -barely- it was 335 lines of something I did not want to maintain. So I let Github Copilot rebuilt it as a proper Node.js web app.",[612,4226,4227],{},"The backend is Express. It proxies requests between the browser and the Azure DevOps REST API and handles authentication, rate limiting, and the branch lookup and delete operations. The frontend is plain HTML, CSS, and vanilla JavaScript without a framework. There is no build step and no bundler on the client side because I wanted a lightweight application. It is simple a new GUI for the DevOps API.",[1142,4229,149],{"id":4230},"lessons-learned",[612,4232,4233],{},"That sounds nice and all, but my git history tells a different story. Let me share my 6 biggest bumps in the road:",[3589,4235,4236,4243,4250,4268,4283,4286],{},[688,4237,4238,4239,4242],{},"I had to rewrite the Authentication part twice. The first attempt used MSAL Node on the server side, which meant managing the OAuth code flow server-side and dealing with session state. How it worked? I don't know because I yolo'd Copilot to do it. Soon I discovered iot worked in theory but added too much complexity for a personal tool. I scrapped it and started over with ",[677,4240,4241],{},"msal-browser",", which acquires the Entra ID access token entirely in the browser using PKCE. The server never sees a client secret and never stores a token. Much simpler. And with examples!",[688,4244,4245,4246,4249],{},"Azure DevOps does not return 401 errors when a token is rejected. It returns a 302 redirect to a sign-in page. That sounds like a minor detail but it completely changes how you detect auth failures. A normal ",[677,4247,4248],{},"response.ok"," check passes on a 302. You get back an HTML login page instead of JSON and the error surfaces somewhere downstream in a confusing way. I had to add explicit handling for all redirect status codes and map them to a useful error message.",[688,4251,4252,4253,4256,4257,4260,4261,3280,4264,4267],{},"Helmet's Content Security Policy blocked MSAL's CDN. Helmet ships with a default CSP that locks down most external script sources. MSAL Browser loads from ",[677,4254,4255],{},"alcdn.msauth.net",", makes token requests to ",[677,4258,4259],{},"login.microsoftonline.com",", and needs those origins in ",[677,4262,4263],{},"scriptSrc",[677,4265,4266],{},"connectSrc"," respectively. None of those are in Helmet's defaults. Easy to fix once you understand what is happening, but the browser console errors were not immediately obvious about which policy rule was blocking what.",[688,4269,4270,4271,4274,4275,4278,4279,4282],{},"Helmet's ",[677,4272,4273],{},"crossOriginOpenerPolicy"," breaks popup window communication. This one took longer. The default value ",[677,4276,4277],{},"same-origin"," prevents the opener page from reading the popup's location after it navigates. That is exactly the mechanism MSAL popup flow depends on. Setting it to ",[677,4280,4281],{},"same-origin-allow-popups"," fixed it, but it is not a setting you would think to check first.",[688,4284,4285],{},"Tokens were appearing in request logs. The Express request logger I added for troubleshooting was faithfully printing every URL, including OAuth redirects that carry authorization codes and access tokens as query parameters. I added a sanitization step that redacts those parameters before logging. It is a small thing but it matters if logs end up in any kind of monitoring system.",[688,4287,4288,4289,4292],{},"The Azure DevOps REST API surface for branches is fairly large. The refs endpoint, the commit details endpoint, the branch stats endpoint, and the batch delete operation all behave slightly differently and the documentation has some gaps. Copilot was genuinely useful here. It could reason about the response shapes and suggest the right request format for things like the batch delete, which expects an array of ref update objects with ",[677,4290,4291],{},"newObjectId"," set to forty zeros to signal deletion. That is not something I would have guessed, but Copilot brought me the answers.",[645,4294,154],{"id":660},[612,4296,4297],{},"Alright, let's get to the interesting part: Installation!",[612,4299,4300],{},"You need Node.js 18 or higher and (of course) an Azure DevOps organization. Clone the repo, install dependencies, and start the server:",[713,4302,4304],{"className":2904,"code":4303,"language":2906,"meta":449,"style":449},"git clone https://github.com/jdgoeij/BranchManager.git\ncd BranchManager/server\nnpm install\nnpm start\n",[677,4305,4306,4315,4322,4330],{"__ignoreMap":449},[1192,4307,4308,4310,4312],{"class":1194,"line":50},[1192,4309,3244],{"class":2913},[1192,4311,3247],{"class":1304},[1192,4313,4314],{"class":1304}," https://github.com/jdgoeij/BranchManager.git\n",[1192,4316,4317,4319],{"class":1194,"line":56},[1192,4318,3255],{"class":1280},[1192,4320,4321],{"class":1304}," BranchManager/server\n",[1192,4323,4324,4327],{"class":1194,"line":92},[1192,4325,4326],{"class":2913},"npm",[1192,4328,4329],{"class":1304}," install\n",[1192,4331,4332,4334],{"class":1194,"line":103},[1192,4333,4326],{"class":2913},[1192,4335,4336],{"class":1304}," start\n",[612,4338,4339,4340,4343],{},"The app opens at ",[677,4341,4342],{},"http://localhost:8080",". For PAT authentication you are ready to go immediately. Just generate a Code Read and Write PAT and use is.",[612,4345,4346,4347,4352,4353,680],{},"For Entra ID, follow the configuration steps in the ",[617,4348,4351],{"href":4349,"rel":4350},"https://github.com/jdgoeij/BranchManager",[657],"README"," to register the app and add your credentials to ",[677,4354,4355],{},"server/.env",[645,4357,159],{"id":4358},"hosting-it-for-your-team",[612,4360,4361,4362,4365],{},"If you want to make Branch Manager available to your whole team, Azure App Service is the simplest option. The ",[677,4363,4364],{},"server/"," folder is a self-contained Node.js app and deploys directly. The README covers three paths: Azure CLI for the fastest setup, the VS Code Azure App Service extension if you prefer a UI, and a GitHub Actions workflow if you want automated deployments on every push to main.",[612,4367,4368,4369,4372],{},"Make sure you add your App Service URL as a redirect URI in your Entra app registration and set ",[677,4370,4371],{},"REDIRECT_URI"," as an environment variable on the App Service. Without this, the OAuth redirect after sign-in will not work. The README walks through exactly what to set.",[645,4374,164],{"id":4375},"what-is-next",[612,4377,4378],{},"A few things are on my list.",[612,4380,4381,4382,4385],{},"The branch table currently loads one project at a time. I want to add a cross-project view so you can see stale branches across your entire organization in one pass. This is a larger API surface but the foundation is already there. I noticed there is a ",[677,4383,4384],{},"/api/all-branches"," endpoint on the server that does exactly this.",[612,4387,4388],{},"I also want to add a CSV export. Sometimes the right action is not deletion but a review with the team first. Being able to export the filtered branch list with last commit info and committer makes that conversation easier.",[612,4390,4391],{},"If you run into something that does not work or have a feature in mind, open an issue on GitHub. The codebase is straightforward enough that contributions are very welcome.",[612,4393,4059,4394,680],{},[617,4395,4064],{"href":4062,"rel":4396},[657],[3878,4398,4399],{},"html pre.shiki code .shcOC, html code.shiki .shcOC{--shiki-light:#6F42C1;--shiki-default:#B392F0;--shiki-dark:#B392F0}html pre.shiki code .sfrk1, html code.shiki .sfrk1{--shiki-light:#032F62;--shiki-default:#9ECBFF;--shiki-dark:#9ECBFF}html pre.shiki code .suiK_, html code.shiki .suiK_{--shiki-light:#005CC5;--shiki-default:#79B8FF;--shiki-dark:#79B8FF}html .light .shiki span {color: var(--shiki-light);background: var(--shiki-light-bg);font-style: var(--shiki-light-font-style);font-weight: var(--shiki-light-font-weight);text-decoration: var(--shiki-light-text-decoration);}html.light .shiki span {color: var(--shiki-light);background: var(--shiki-light-bg);font-style: var(--shiki-light-font-style);font-weight: var(--shiki-light-font-weight);text-decoration: var(--shiki-light-text-decoration);}html .default .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}html.dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}",{"title":449,"searchDepth":56,"depth":56,"links":4401},[4402,4403,4404,4405,4408,4409,4410],{"id":4113,"depth":56,"text":129},{"id":4126,"depth":56,"text":134},{"id":4190,"depth":56,"text":139},{"id":4221,"depth":56,"text":144,"children":4406},[4407],{"id":4230,"depth":92,"text":149},{"id":660,"depth":56,"text":154},{"id":4358,"depth":56,"text":159},{"id":4375,"depth":56,"text":164},{"label":4412,"url":4349},"View Branch Manager on GitHub","2026-03-05",{},"A self-hosted Node.js web app that connects to Azure DevOps, lets you filter and review branches across all repos, and deletes them in bulk with a single confirmation.","Azure DevOps has no built-in UI for bulk branch cleanup across multiple repositories, leaving teams to deal with hundreds of stale branches manually.",{"title":14,"description":121},[4419,4420,4421,1027,4422],"Azure DevOps","Azure","Tools","Git","qJBxciYwjPCt_Gdh0K7OXpDDUcxH9oZLfrtML8ehSEw",1776367711669]