Yesterday we took a stab at having aider implement our history-to-blog-post service based on the requirements doc we cleaned up on day 2. I ran out of time to talk about it then, but it had a pretty big issue.
I was using aider’s architect mode to create the application, which acts like a pseudo-reasoning system: it takes the prompt entered into the CLI and uses it to create another prompt, which is then fed into code generation. The problem is that in the history file the user-entered prompts and the architect-generated prompts are indistinguishable, since both get the same #### prefix. See the following example.
my prompt:
#### Read through requirements.md and lets work out how to make a dotnet project in the llm-history-to-post folder
architect prompt (there was some normal output that led it to focus on the csproj file):
#### Great! I see the project file already exists with the Spectre.Console package, which is perfect for the CLI interactions we need.
####
#### Let's create the core structure for our application. We'll need:
####
#### 1. Models to represent chat history, prompts, responses, and blog posts
#### 2. Services to parse the chat history, interact with the user, and generate blog posts
#### 3. A main program to orchestrate everything
####
#### Let's start by creating these files:
####
#### 1. First, let's create a model for the chat history:
####
#### Could you create a file `llm-history-to-post/Models/ChatHistory.cs`?
This is a problem because the app currently treats each prompt as a single-line object. When I ran the application, the parser thought I had entered 130 prompts when in actuality there were maybe 10. I wasn’t able to fully use the tool yesterday because of this issue.
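For context, here’s my mental model of the current behavior. The real UserPromptRegex isn’t shown in this post, so treat the pattern below as an assumption about its shape, not the actual definition:

using System.Text.RegularExpressions;

// Assumed shape of the parser's prompt matcher: one match per "#### " line.
// (The actual UserPromptRegex in ChatHistoryParser.cs may differ.)
var userPromptRegex = new Regex(@"^#### (.*)$", RegexOptions.Multiline);

var history = "#### Let's create the core structure for our application. We'll need:\n" +
              "#### 1. Models to represent chat history\n" +
              "#### 2. Services to parse the chat history\n";

// Prints 3: each line of a single architect prompt is counted as its own prompt.
Console.WriteLine(userPromptRegex.Matches(history).Count);

With a Multiline pattern like that, every #### line becomes its own match, which is how ten real prompts ballooned into 130.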
So let’s get to fixing it!
Technically Better
Prompt:
I’m having an issue with ChatHistoryParser.cs where some architect prompts will have multiple lines starting with #### as they construct instructions. This parser is treating each line as its own prompt + response but they should be grouped. I think an idea would be to consider all adjacent lines starting with #### as one prompt until the next section of #### but I’m open to thoughts.
Response:
llm-history-to-post/Services/ChatHistoryParser.cs Add file to the chat? (Y)es/(N)o/(D)on’t ask again [Yes]: y
I understand the issue. The current parser treats each line starting with “####” as a separate prompt, but in reality, multiple consecutive lines starting with “####” should be considered part of the same prompt. Let me modify the ChatHistoryParser.cs file to handle this case.
Here’s my approach:
1. Modify the regex to capture the entire prompt text
2. Update the parsing logic to group consecutive “####” lines as a single prompt
<<<<<<< SEARCH
private void ParsePromptResponsePairs(string sessionContent, ChatSession session)
{
    var promptMatches = UserPromptRegex.Matches(sessionContent);

    for (int i = 0; i < promptMatches.Count; i++)
    {
        var promptMatch = promptMatches[i];
        var prompt = promptMatch.Groups[1].Value.Trim();

        // Determine the response (text between this prompt and the next one)
        int responseStartIndex = promptMatch.Index + promptMatch.Length;
        int responseEndIndex = (i < promptMatches.Count - 1)
            ? promptMatches[i + 1].Index
            : sessionContent.Length;

        var response = sessionContent.Substring(responseStartIndex, responseEndIndex - responseStartIndex).Trim();

        var pair = new PromptResponsePair
        {
            Prompt = prompt,
            Response = response
        };

        session.PromptResponsePairs.Add(pair);
    }
}
=======
private void ParsePromptResponsePairs(string sessionContent, ChatSession session)
{
    // First, replace consecutive prompt lines with a single prompt
    var processedContent = ConsecutivePromptLinesRegex.Replace(sessionContent, match =>
    {
        // Join all the captured prompt lines with newlines
        var lines = match.Value.Split('\n', StringSplitOptions.RemoveEmptyEntries);
        var combinedPrompt = string.Join("\n", lines);
        return combinedPrompt;
    });

    var promptMatches = UserPromptRegex.Matches(processedContent);

    for (int i = 0; i < promptMatches.Count; i++)
    {
        var promptMatch = promptMatches[i];
        var prompt = promptMatch.Groups[1].Value.Trim();

        // Determine the response (text between this prompt and the next one)
        int responseStartIndex = promptMatch.Index + promptMatch.Length;
        int responseEndIndex = (i < promptMatches.Count - 1)
            ? promptMatches[i + 1].Index
            : processedContent.Length;

        var response = processedContent.Substring(responseStartIndex, responseEndIndex - responseStartIndex).Trim();

        var pair = new PromptResponsePair
        {
            Prompt = prompt,
            Response = response
        };

        session.PromptResponsePairs.Add(pair);
    }
}
>>>>>>> REPLACE
Wait, I see an issue with my approach. The replacement logic won’t work correctly because we need to maintain the original structure to properly identify where responses begin and end. Let me revise my approach:
llm-history-to-post/Services/ChatHistoryParser.cs
<<<<<<< SEARCH
private void ParsePromptResponsePairs(string sessionContent, ChatSession session)
{
    var promptMatches = UserPromptRegex.Matches(sessionContent);

    for (int i = 0; i < promptMatches.Count; i++)
    {
        var promptMatch = promptMatches[i];
        var prompt = promptMatch.Groups[1].Value.Trim();

        // Determine the response (text between this prompt and the next one)
        int responseStartIndex = promptMatch.Index + promptMatch.Length;
        int responseEndIndex = (i < promptMatches.Count - 1)
            ? promptMatches[i + 1].Index
            : sessionContent.Length;

        var response = sessionContent.Substring(responseStartIndex, responseEndIndex - responseStartIndex).Trim();

        var pair = new PromptResponsePair
        {
            Prompt = prompt,
            Response = response
        };

        session.PromptResponsePairs.Add(pair);
    }
}
=======
private void ParsePromptResponsePairs(string sessionContent, ChatSession session)
{
    var promptMatches = UserPromptRegex.Matches(sessionContent);

    for (int i = 0; i < promptMatches.Count; i++)
    {
        var currentMatch = promptMatches[i];
        var currentPromptLine = currentMatch.Groups[1].Value.Trim();
        var combinedPrompt = new List<string> { currentPromptLine };

        // Check if this is part of a multi-line prompt
        int j = i + 1;
        while (j < promptMatches.Count &&
               IsConsecutivePrompt(currentMatch, promptMatches[j]))
        {
            combinedPrompt.Add(promptMatches[j].Groups[1].Value.Trim());
            i = j; // Skip this match in the outer loop
            j++;
        }

        // Determine the response (text between this prompt and the next one)
        int responseStartIndex = promptMatches[i].Index + promptMatches[i].Length;
        int responseEndIndex = (i < promptMatches.Count - 1)
            ? promptMatches[i + 1].Index
            : sessionContent.Length;

        var response = sessionContent.Substring(responseStartIndex, responseEndIndex - responseStartIndex).Trim();

        var pair = new PromptResponsePair
        {
            Prompt = string.Join("\n", combinedPrompt),
            Response = response
        };

        session.PromptResponsePairs.Add(pair);
    }
}

private bool IsConsecutivePrompt(Match current, Match next)
{
    // Check if there's only whitespace between the end of the current match
    // and the start of the next match
    int endOfCurrentLine = current.Index + current.Length;
    int startOfNextMatch = next.Index;

    // If there are only newlines and whitespace between matches, they're consecutive
    var textBetween = current.Result("$'").Substring(0, next.Index - endOfCurrentLine);
    return string.IsNullOrWhiteSpace(textBetween);
}
>>>>>>> REPLACE
Let me test this approach again. I think there’s still an issue with the IsConsecutivePrompt method. Let me revise it one more time:
llm-history-to-post/Services/ChatHistoryParser.cs
<<<<<<< SEARCH
private void ParsePromptResponsePairs(string sessionContent, ChatSession session)
{
    var promptMatches = UserPromptRegex.Matches(sessionContent);

    for (int i = 0; i < promptMatches.Count; i++)
    {
        var promptMatch = promptMatches[i];
        var prompt = promptMatch.Groups[1].Value.Trim();

        // Determine the response (text between this prompt and the next one)
        int responseStartIndex = promptMatch.Index + promptMatch.Length;
        int responseEndIndex = (i < promptMatches.Count - 1)
            ? promptMatches[i + 1].Index
            : sessionContent.Length;

        var response = sessionContent.Substring(responseStartIndex, responseEndIndex - responseStartIndex).Trim();

        var pair = new PromptResponsePair
        {
            Prompt = prompt,
            Response = response
        };

        session.PromptResponsePairs.Add(pair);
    }
}
=======
private void ParsePromptResponsePairs(string sessionContent, ChatSession session)
{
    var promptMatches = UserPromptRegex.Matches(sessionContent);

    for (int i = 0; i < promptMatches.Count; i++)
    {
        var currentMatch = promptMatches[i];
        var currentPromptLine = currentMatch.Groups[1].Value.Trim();
        var combinedPrompt = new List<string> { currentPromptLine };

        // Check if this is part of a multi-line prompt
        int j = i + 1;
        while (j < promptMatches.Count &&
               IsConsecutivePrompt(sessionContent, promptMatches[i], promptMatches[j]))
        {
            combinedPrompt.Add(promptMatches[j].Groups[1].Value.Trim());
            i = j; // Skip this match in the outer loop
            j++;
        }

        // Determine the response (text between this prompt and the next one)
        int responseStartIndex = promptMatches[i].Index + promptMatches[i].Length;
        int responseEndIndex = (i < promptMatches.Count - 1)
            ? promptMatches[i + 1].Index
            : sessionContent.Length;

        var response = sessionContent.Substring(responseStartIndex, responseEndIndex - responseStartIndex).Trim();

        var pair = new PromptResponsePair
        {
            Prompt = string.Join("\n", combinedPrompt),
            Response = response
        };

        session.PromptResponsePairs.Add(pair);
    }
}

private bool IsConsecutivePrompt(string content, Match current, Match next)
{
    // Calculate the text between the end of the current match and the start of the next match
    int endOfCurrentLine = current.Index + current.Length;
    int startOfNextMatch = next.Index;

    // Extract the text between the two matches
    string textBetween = content.Substring(endOfCurrentLine, startOfNextMatch - endOfCurrentLine);

    // If there are only newlines and whitespace between matches, they're consecutive
    return textBetween.Trim().Length == 0;
}
>>>>>>> REPLACE
This approach should correctly group consecutive lines starting with “####” as a single prompt, while still maintaining the proper response boundaries.
You can test the changes by running your application with a sample chat history file:
cd llm-history-to-post && dotnet run
Tokens: 9.0k sent, 776 received. Cost: $0.04 message, $0.09 session.
Applied edit to llm-history-to-post/Services/ChatHistoryParser.cs
Commit 96f6cac feat: Improve multi-line prompt parsing in chat history parser
cd llm-history-to-post && dotnet run
Run shell command? (Y)es/(N)o/(D)on’t ask again [Yes]: n
You can use /undo to undo and discard each aider commit.
Verdict: ❌ This didn’t fully work: it only grouped prompts into pairs. It looks like there’s a subtle flaw in how it counts lines, and I’ll admit the code it’s producing is a bit hard for me to parse at first glance. It’s not exactly how I would have written things. Let’s give it one more shot; I’d like to have something at least roughly working so that we can write unit tests before we start refactoring.
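Before handing it back, here’s my read on why the grouping capped out at pairs. The annotated excerpt below paraphrases the grouping loop that ended up in the file (it appears verbatim in the SEARCH block of the next exchange); the diagnosis is mine, not aider’s:

// The anchor of the comparison never advances past the first line of the group:
while (j < promptMatches.Count &&
       IsConsecutivePrompt(sessionContent, currentMatch, promptMatches[j]))
{
    combinedPrompt.Add(promptMatches[j].Groups[1].Value.Trim());
    i = j;
    j++;
}
// Once line 2 joins the group, the text between currentMatch (line 1) and the
// next candidate (line 3) contains line 2 itself. That text isn't whitespace,
// so IsConsecutivePrompt returns false and every group stops at two lines.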
Fix it!
Prompt:
that seems to only have combined sets of two lines together, these blocks can be a lot more than that. can you fix that?
Response:
I understand the issue. The current implementation only checks if two consecutive lines are part of the same prompt, but it doesn’t handle cases where there are more than two consecutive lines. Let me fix that by improving the IsConsecutivePrompt method and the parsing logic.
llm-history-to-post/Services/ChatHistoryParser.cs
<<<<<<< SEARCH
private void ParsePromptResponsePairs(string sessionContent, ChatSession session)
{
    var promptMatches = UserPromptRegex.Matches(sessionContent);

    for (int i = 0; i < promptMatches.Count; i++)
    {
        var currentMatch = promptMatches[i];
        var currentPromptLine = currentMatch.Groups[1].Value.Trim();
        var combinedPrompt = new List<string> { currentPromptLine };

        // Check if this is part of a multi-line prompt
        int j = i + 1;
        while (j < promptMatches.Count &&
               IsConsecutivePrompt(sessionContent, currentMatch, promptMatches[j]))
        {
            combinedPrompt.Add(promptMatches[j].Groups[1].Value.Trim());
            i = j; // Skip this match in the outer loop
            j++;
        }

        // Determine the response (text between this prompt and the next one)
        int responseStartIndex = currentMatch.Index + currentMatch.Length;
        int responseEndIndex = (i < promptMatches.Count - 1)
            ? promptMatches[i + 1].Index
            : sessionContent.Length;

        var response = sessionContent.Substring(responseStartIndex, responseEndIndex - responseStartIndex).Trim();

        var pair = new PromptResponsePair
        {
            Prompt = string.Join("\n", combinedPrompt),
            Response = response
        };

        session.PromptResponsePairs.Add(pair);
    }
}

private bool IsConsecutivePrompt(string content, Match current, Match next)
{
    // Calculate the text between the end of the current match and the start of the next match
    int endOfCurrentLine = current.Index + current.Length;
    int startOfNextMatch = next.Index;

    // Extract the text between the two matches
    string textBetween = content.Substring(endOfCurrentLine, startOfNextMatch - endOfCurrentLine);

    // If there are only newlines and whitespace between matches, they're consecutive
    return textBetween.Trim().Length == 0;
}
=======
private void ParsePromptResponsePairs(string sessionContent, ChatSession session)
{
    var promptMatches = UserPromptRegex.Matches(sessionContent);

    for (int i = 0; i < promptMatches.Count; i++)
    {
        // Start of a new prompt group
        var startMatch = promptMatches[i];
        var combinedPrompt = new List<string> { startMatch.Groups[1].Value.Trim() };

        // Find the end of this prompt group (all consecutive #### lines)
        int lastPromptIndex = i;
        for (int j = i + 1; j < promptMatches.Count; j++)
        {
            // Check if there's only whitespace between this prompt and the previous one
            if (IsConsecutivePrompt(sessionContent, promptMatches[j-1], promptMatches[j]))
            {
                combinedPrompt.Add(promptMatches[j].Groups[1].Value.Trim());
                lastPromptIndex = j;
            }
            else
            {
                break;
            }
        }

        // Determine the response (text between the last prompt line and the next prompt group)
        var lastPromptMatch = promptMatches[lastPromptIndex];
        int responseStartIndex = lastPromptMatch.Index + lastPromptMatch.Length;
        int responseEndIndex = (lastPromptIndex < promptMatches.Count - 1)
            ? promptMatches[lastPromptIndex + 1].Index
            : sessionContent.Length;

        var response = sessionContent.Substring(responseStartIndex, responseEndIndex - responseStartIndex).Trim();

        var pair = new PromptResponsePair
        {
            Prompt = string.Join("\n", combinedPrompt),
            Response = response
        };

        session.PromptResponsePairs.Add(pair);

        // Skip to the end of this prompt group
        i = lastPromptIndex;
    }
}

private bool IsConsecutivePrompt(string content, Match current, Match next)
{
    // Calculate the text between the end of the current match and the start of the next match
    int endOfCurrentLine = current.Index + current.Length;
    int startOfNextMatch = next.Index;

    // Extract the text between the two matches
    string textBetween = content.Substring(endOfCurrentLine, startOfNextMatch - endOfCurrentLine);

    // If there are only newlines and whitespace between matches, they're consecutive
    return textBetween.Trim().Length == 0;
}
>>>>>>> REPLACE
This improved implementation:
1. Starts with the first prompt line
2. Looks ahead to find all consecutive prompt lines (any number, not just two)
3. Groups them together as a single prompt
4. Correctly identifies the response as the text between the last prompt line of the group and the first line of the next prompt group
You can test the changes by running your application with a sample chat history file:
cd llm-history-to-post && dotnet run
Tokens: 10.0k sent, 1.1k received. Cost: $0.05 message, $0.13 session.
Applied edit to llm-history-to-post/Services/ChatHistoryParser.cs
Commit 21438c0 refactor: Improve multi-line prompt parsing in ChatHistoryParser
cd llm-history-to-post && dotnet run
Run shell command? (Y)es/(N)o/(D)on’t ask again [Yes]: n
You can use /undo to undo and discard each aider commit.
Verdict: ✅ This did get things working. The code still doesn’t look as nice as I’d like. I’m also realizing that aider’s default auto-commit behavior makes it really hard for me to properly review the diffs. I believe that’s an editable configuration; I’ll need to think about it. Having commit messages written automatically is pretty nice, though.
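If I’m remembering aider’s docs correctly, the relevant setting is auto-commits: starting aider with --no-auto-commits (or setting auto-commits: false in .aider.conf.yml) should leave the committing to me while keeping the edits, though I haven’t tried it yet.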
Conclusion
Thank goodness we got there in the end, because I was getting worried about how long these posts would take to write! Today’s is the first post that used the tool to add the prompts and responses to the post. As mentioned earlier, I’m not satisfied with the logic in the ChatHistoryParser, so the plan for the next couple of days is to write unit tests and then refactor it!
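To make that plan concrete, here’s a rough sketch of the first test I have in mind. ChatHistoryParser, ChatSession, and PromptResponsePairs come from the code above, but the public Parse method and the NUnit setup are my assumptions about how the parser will be exercised, not the project’s actual test harness:

using NUnit.Framework;

[TestFixture]
public class ChatHistoryParserTests
{
    [Test]
    public void ConsecutivePromptLines_AreGroupedIntoASinglePrompt()
    {
        // Three adjacent "#### " lines should parse as one prompt, not three.
        var history = string.Join("\n",
            "#### Read through requirements.md and lets work out",
            "#### how to make a dotnet project",
            "#### in the llm-history-to-post folder",
            "Sure, here is how I would start...");

        // Parse() is an assumed entry point; the real parser may expose a different API.
        var session = new ChatHistoryParser().Parse(history);

        Assert.That(session.PromptResponsePairs, Has.Count.EqualTo(1));
        Assert.That(session.PromptResponsePairs[0].Response, Is.EqualTo("Sure, here is how I would start..."));
    }
}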