The ChatGPT Consolidation Test

Niv Nissenson
Jul 15, 2025
Can ChatGPT handle basic accounting consolidations?

Following up on our previous analysis of ChatGPT's US GAAP knowledge, I decided to put the latest GPT-4o model through another practical test that every financial executive deals with: accounting consolidations. The results were... illuminating.

I asked ChatGPT whether it could do a simple accounting consolidation.

"Yes, absolutely," it replied - confident as ever!

I deliberately kept this test straightforward to give ChatGPT the best chance of success. Here's what I presented:

Two simple companies: Company C (parent) and Company D (subsidiary)

Company C owns 100% of Company D

Basic Balance Sheet and P&L statements for both entities

No intercompany transactions (in order not to complicate matters)

Clear labeling of "Investment in Company D" on the parent's balance sheet

The AI did ask some reasonable clarifying questions that gave me hope it understood the fundamentals of consolidation accounting. The questions suggested it was thinking through the process methodically.

ChatGPT's consolidated P&L was technically correct, though it included multiple zero lines that cluttered the presentation. When I asked for a classical P&L format, it cleaned this up adequately. So far, so good.

The Balance Sheet is where ChatGPT stumbled significantly. Despite the clear consolidation requirements, it made two fundamental errors and an annoyance:

Failed to eliminate the "Investment in Company D" asset - even though it was explicitly labeled
Failed to eliminate the subsidiary's equity - a basic consolidation requirement
It used its own formatting, which wasn't consistent with what I provided

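The result was a balance sheet that counted the subsidiary twice: once through its own assets and equity, and again through the parent's "Investment in Company D" line. That inflated the consolidated totals.

The format change was also annoying, with lots of "zero" lines cluttering the output.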
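To make the double-counting concrete, here is a minimal Python sketch of the elimination ChatGPT skipped. The figures, the line names other than "Investment in Company D", and the helper functions are all hypothetical and are not the data from my test. The sketch also assumes the investment is carried at exactly the subsidiary's equity, so there is no goodwill to deal with.

```python
# Hypothetical balance sheets for illustration; not the figures used in the test.
ASSET_LINES = {"Cash", "Investment in Company D"}

company_c = {  # parent
    "Cash": 100_000,
    "Investment in Company D": 50_000,
    "Liabilities": 30_000,
    "Equity": 120_000,
}
company_d = {  # 100%-owned subsidiary
    "Cash": 80_000,
    "Liabilities": 30_000,
    "Equity": 50_000,
}


def combine(parent, subsidiary):
    """Add the two balance sheets line by line, with no eliminations."""
    combined = dict(parent)
    for line, amount in subsidiary.items():
        combined[line] = combined.get(line, 0) + amount
    return combined


def eliminate_investment(combined, investment_line, subsidiary_equity):
    """Remove the parent's investment and the subsidiary's equity."""
    consolidated = dict(combined)
    investment = consolidated.pop(investment_line)
    if investment != subsidiary_equity:
        # Goodwill and fair-value adjustments are out of scope for this sketch.
        raise ValueError("investment does not equal subsidiary equity")
    consolidated["Equity"] -= subsidiary_equity
    return consolidated


def total_assets(balance_sheet):
    """Sum only the lines that are classified as assets."""
    return sum(v for k, v in balance_sheet.items() if k in ASSET_LINES)


naive = combine(company_c, company_d)
consolidated = eliminate_investment(
    naive, "Investment in Company D", company_d["Equity"]
)

print(total_assets(naive))         # 230000 (inflated by the 50,000 investment)
print(total_assets(consolidated))  # 180000
print(consolidated["Equity"])      # 120000 (the parent's equity only)
```

Note that the naive version still balances, which is what makes this error easy to miss: both sides are overstated by the same amount.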

OK, so ChatGPT needed a refresher on how to consolidate - fair enough. I explicitly asked it to remove the investment line, and it actually made things worse. Now the AI produced a balance sheet that didn't balance, and it also complained that the subsidiary's liabilities were missing (they weren't - they were clearly in the original data).

It was time to "snowplow" the way for the AI accountant so I reattached the spreadsheets then explicitly told the chat: "Please eliminate the investment in subsidiary and reconcile the consolidated equity."

ChatGPT acknowledged its oversight and made a hopeful statement about getting it right this time. However, it then inflated the balance sheet to $420K, apparently by aggregating line totals incorrectly.

With another failure, I pivoted. If the AI is sloppy with Excel files, perhaps appealing to its logic and reasoning would help it figure things out. So I asked it, "How did you get to $420K in total assets?" And voilà - it realized its mistake (blaming it on sloppy aggregations) and finally delivered an accurate balance sheet. The first version was in a weird format, but after a couple of iterations with requests for classic balance sheet formatting, it achieved the desired result.

Key Takeaways for Financial Executives:

This test revealed several patterns:

1. Format Sensitivity & Error Propagation

ChatGPT struggled with the classical financial statement format I provided. When the AI made formatting changes, it introduced new errors (such as incorrectly aggregated totals). This highlights the risk of AI "improvements" that actually make things worse.

2. Lack of Transparency

The AI didn't show formulas or calculations in its Excel outputs, making it difficult to trace where errors originated - a significant issue for audit trails and error correction.
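As a rough illustration of the kind of traceability I was hoping for, here is a small Python sketch of a consolidation worksheet that keeps the parent, subsidiary, and elimination columns visible next to each consolidated figure. The amounts, most line names, and the `build_worksheet` helper are hypothetical; this is not something ChatGPT produced.

```python
# Hypothetical figures; the point is the visible trail, not the numbers.
parent = {
    "Cash": 100_000,
    "Investment in Company D": 50_000,
    "Liabilities": 30_000,
    "Equity": 120_000,
}
subsidiary = {"Cash": 80_000, "Liabilities": 30_000, "Equity": 50_000}

# Elimination entries: remove the investment and the subsidiary's equity.
eliminations = {"Investment in Company D": -50_000, "Equity": -50_000}


def build_worksheet(parent, subsidiary, eliminations):
    """Return one row per line: (line, parent, subsidiary, elimination, total)."""
    lines = list(parent) + [l for l in subsidiary if l not in parent]
    rows = []
    for line in lines:
        p = parent.get(line, 0)
        s = subsidiary.get(line, 0)
        e = eliminations.get(line, 0)
        rows.append((line, p, s, e, p + s + e))
    return rows


rows = build_worksheet(parent, subsidiary, eliminations)

print(f"{'Line':<26}{'Parent':>10}{'Sub':>10}{'Elim':>10}{'Consol':>10}")
for line, p, s, e, total in rows:
    print(f"{line:<26}{p:>10,}{s:>10,}{e:>10,}{total:>10,}")
```

With every column kept, anyone reviewing the output can see which adjustment produced each consolidated number instead of having to reverse-engineer a single total.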

3. Knowledge vs. Application Gap

ChatGPT technically "knew" what needed to be done (eliminating investment/subsidiary equity) but failed to apply this knowledge until explicitly prompted. This suggests a gap between theoretical understanding and practical application.

The Broader Implications

This simple consolidation test - something a first-year accounting student should handle - revealed that we're still far from AI replacing controller-level functions. The technology shows promise but lacks the intuitive understanding and reliability that financial professionals require.

However, this experience reinforced one critical insight:

Data Quality Remains King: The age-old principle of "garbage in, garbage out" applies fully to AI. Companies with clean, well-structured financial data will extract far more value from AI tools than those with messy, inconsistent information.

While ChatGPT didn't pass this basic consolidation test, the technology is evolving rapidly. The key for CFOs and financial executives is to:

Maintain realistic expectations about current AI capabilities

Invest in data quality and standardization

Develop AI literacy within finance teams

Test AI tools thoroughly before relying on them for critical processes
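One possible way to act on that last point is to run a few automated sanity checks on any AI-produced consolidation before trusting it. The sketch below is only an illustration: the `check_consolidation` function, the line names other than "Investment in Company D", and the figures are hypothetical. The equity check assumes a 100%-owned subsidiary whose investment balance equals its equity, as in the simple case I tested.

```python
def check_consolidation(balance_sheet, asset_lines, investment_line, parent_equity):
    """Return a list of problems found in a consolidated balance sheet."""
    problems = []
    assets = sum(v for k, v in balance_sheet.items() if k in asset_lines)
    liabilities_and_equity = sum(
        v for k, v in balance_sheet.items() if k not in asset_lines
    )
    if assets != liabilities_and_equity:
        problems.append("balance sheet does not balance")
    if balance_sheet.get(investment_line, 0) != 0:
        problems.append("investment in subsidiary was not eliminated")
    # Under the stated assumptions, consolidated equity equals the parent's equity.
    if balance_sheet.get("Equity") != parent_equity:
        problems.append("consolidated equity differs from parent equity")
    return problems


ASSET_LINES = {"Cash", "Investment in Company D"}
INVESTMENT = "Investment in Company D"

# Hypothetical outputs mirroring the kinds of errors described in this post.
nothing_eliminated = {
    "Cash": 180_000, INVESTMENT: 50_000, "Liabilities": 60_000, "Equity": 170_000,
}
investment_only_removed = {"Cash": 180_000, "Liabilities": 60_000, "Equity": 170_000}
correct = {"Cash": 180_000, "Liabilities": 60_000, "Equity": 120_000}

for name, sheet in [
    ("nothing eliminated", nothing_eliminated),
    ("investment only removed", investment_only_removed),
    ("correct", correct),
]:
    print(name, check_consolidation(sheet, ASSET_LINES, INVESTMENT, 120_000))
```

Checks like these would not replace a controller's review, but they would have flagged each of the failed attempts above without anyone reading the spreadsheet line by line.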

We will continue to test this and other models to see how they improve and develop over time.
