Issue 634: Reduce memory usage of DartScanner
Status:  WontFix
Owner:  ----
Closed:  Apr 2013


Reported by zundel@google.com, Nov 29, 2011
When running some unit tests with heap space constrained to 32M, 2 tests run out of memory:

=== debugia32 dartc co19/LibTest/core/List/sort/List/sort/A01/t06 ===
=== debugia32 dartc co19/LibTest/core/List/sort/List/sort/A01/t05 ===

At the top of the heap histogram are DartScanner.Location, DartScanner.Position and DartScanner.TokenData, with over 200K instances each, adding up to 75M of the 32M heap.

1) The scanner currently tokenizes the entire file into memory, which may be overly aggressive. If we tokenized the file in chunks and did not keep references to these objects throughout the parse, the GC might be able to reclaim them.

2) In DartScanner.Location, the code currently stores Position objects for the start and end of each token. Each Position object contains 3 integers: line number, column number, and offset.

We could reduce memory usage by storing only 2 integers in Location: the start and end of the token as offsets from the start of the file. Then, for each source file, keep an index that records the offset at which each line begins. Since the line and column positions are rarely accessed, the index could be as simple as an array, with a binary search to find the line number for a given character offset.
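
To make proposal 2) concrete, here is a minimal sketch of what such an index might look like. The class name LineIndex and its methods are hypothetical and are not part of DartScanner; the sketch assumes lines are separated by '\n' only and reports 1-based line and column numbers.

```java
import java.util.Arrays;

/** Hypothetical per-file index: resolves a character offset to line/column on demand. */
final class LineIndex {
    // lineStarts[i] is the offset of the first character of line i (0-based).
    private final int[] lineStarts;

    private LineIndex(int[] lineStarts) {
        this.lineStarts = lineStarts;
    }

    /** Builds the index in two passes over the source: count lines, then record starts. */
    static LineIndex of(String source) {
        int count = 1;
        for (int i = 0; i < source.length(); i++) {
            if (source.charAt(i) == '\n') count++;
        }
        int[] starts = new int[count]; // starts[0] is 0: the first line begins at offset 0
        int line = 1;
        for (int i = 0; i < source.length(); i++) {
            if (source.charAt(i) == '\n') starts[line++] = i + 1;
        }
        return new LineIndex(starts);
    }

    /** Returns the 1-based line number containing the given offset. */
    int lineOf(int offset) {
        int idx = Arrays.binarySearch(lineStarts, offset);
        // On a miss, binarySearch returns -(insertionPoint) - 1;
        // the containing line is the one just before the insertion point.
        int zeroBased = idx >= 0 ? idx : -idx - 2;
        return zeroBased + 1;
    }

    /** Returns the 1-based column of the given offset within its line. */
    int columnOf(int offset) {
        return offset - lineStarts[lineOf(offset) - 1] + 1;
    }
}
```

With an index like this, a Location would only need two int fields (start and end offsets), and the line and column would be computed in O(log n) when an error message actually needs them.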

Nov 29, 2011
#1 zundel@google.com
We looked a bit at item 1) and it is tricky due to the scanner's ability to do rollback. We need to keep all parsed tokens in memory in case of rollback (or cleverly rewind and re-tokenize to a particular place in that case).
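
One possible middle ground, sketched below purely as an illustration, is to keep only the tokens that an outstanding rollback point could still reach and drop everything older. The class TokenWindow and its mark/rollback/commit methods are hypothetical and do not reflect the actual DartScanner API; tokens are plain strings here to keep the example short.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical token window: buffers only tokens a pending rollback could revisit. */
final class TokenWindow {
    private final Iterator<String> lexer;               // produces tokens lazily
    private final List<String> buffer = new ArrayList<>();
    private final ArrayDeque<Integer> marks = new ArrayDeque<>();
    private int base = 0;                                // absolute index of buffer.get(0)
    private int pos = 0;                                 // absolute index of the next token

    TokenWindow(Iterator<String> lexer) {
        this.lexer = lexer;
    }

    /** Returns the next token, or null at end of input. */
    String next() {
        while (pos - base >= buffer.size()) {
            if (!lexer.hasNext()) return null;
            buffer.add(lexer.next());
        }
        String token = buffer.get(pos - base);
        pos++;
        trim();
        return token;
    }

    /** Remembers the current position so the parser can return to it. */
    void mark() {
        marks.push(pos);
    }

    /** Rewinds to the most recent mark and discards that mark. */
    void rollback() {
        pos = marks.pop();
    }

    /** Discards the most recent mark without rewinding. */
    void commit() {
        marks.pop();
        trim();
    }

    /** Number of tokens currently held in memory. */
    int buffered() {
        return buffer.size();
    }

    // Drops tokens that lie before both the read position and the oldest mark.
    private void trim() {
        int keepFrom = marks.isEmpty() ? pos : marks.peekLast();
        if (keepFrom > base) {
            buffer.subList(0, keepFrom - base).clear();
            base = keepFrom;
        }
    }
}
```

This only saves memory if rollback points are short-lived; a mark taken at the start of the file and held for the whole parse would still pin every token.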

As an alternative way to save some memory, we noticed that the entire source is kept around. We first tried to eliminate all references to the source in DartScanner by making new String() instances for each TokenData.value field. That helped there, but the DartSource object also keeps around a full copy of the source.
Apr 17, 2012
Project Member #2 kasperl@google.com
(No comment was entered for this change.)
Labels: -Area-Compiler Area-Analyzer
Jun 7, 2012
Project Member #3 brianwilkerson@google.com
(No comment was entered for this change.)
Labels: Milestone-Later
Aug 28, 2012
Project Member #4 brianwilkerson@google.com
(No comment was entered for this change.)
Labels: -Milestone-Later Milestone-M3
Jan 23, 2013
Project Member #5 scheglov@google.com
(No comment was entered for this change.)
Labels: -Milestone-M3 Milestone-Later AnalysisEngine
Apr 7, 2013
#6 amouravski@google.com
(No comment was entered for this change.)
Labels: Editor-AnalysisEngine
Apr 17, 2013
Project Member #7 brianwilkerson@google.com
dartc has been deprecated in favor of the new analysis engine.
Status: WontFix
Labels: -AnalysisEngine -Editor-AnalysisEngine
May 23, 2013
Project Member #8 clayberg@google.com
(No comment was entered for this change.)
Labels: -Milestone-Later Milestone-M5