ID |
Status |
Summary |
130
|
New |
I can't crawl all the pages of a site
Type-Defect
Priority-Medium
|
129
|
Fixed |
AutoMapper MappingException after update
Type-Defect
Priority-Medium
|
128
|
Fixed |
'Abot.Poco.CrawlResult' does not contain a definition for 'ErrorMessage'
Type-Defect
Priority-Medium
|
127
|
Fixed |
Using ampersand in URLs causes "Page has no content"
Type-Defect
Priority-Medium
Milestone-Release1.2.3
|
126
|
Fixed |
GitHub bugs/issues
Type-Defect
Priority-Medium
Milestone-Release1.2.3
|
125
|
Duplicate |
Possibility to stop crawling (and start crawling async)
Type-Enhancement
Priority-Medium
|
124
|
Fixed |
Retry crawl on WebException caused by a bad proxy
Type-Defect
Priority-Medium
|
123
|
Duplicate |
Encoding issue
Type-Defect
Priority-Medium
|
122
|
WontFix |
Ability to add metadata to crawl queue
Type-Defect
Priority-Medium
|
121
|
Invalid |
Very slow
Type-Defect
Priority-Medium
|
120
|
Accepted |
Remove all marked with [Obsolete] attribute
Type-Defect
Priority-Medium
Milestone-Release2.0
|
119
|
Fixed |
MaxPagesToCrawl is broken
Type-Defect
Priority-Critical
Milestone-Release1.2.3
|
118
|
Fixed |
RobotsDotText implementation disallows all external pages
Type-Defect
Priority-Medium
Milestone-Release1.2.3
|
117
|
Invalid |
Incorrectly parsing robots.txt
Type-Defect
Priority-Medium
|
116
|
Invalid |
How to download the crawled web pages
Type-Defect
Priority-Medium
|
115
|
Invalid |
Abot.SiteSimulator.casproj can not be opened
Type-Defect
Priority-Medium
|
114
|
New |
It's great but could not save session (it's very important)
Priority-Medium
Milestone-Release2.0
Type-Feature
|
113
|
Fixed |
Converting RawContent to a byte stream and writing to disk corrupts the data
Type-Defect
Priority-Critical
Milestone-Release1.2.3
|
112
|
Fixed |
Create an auto-encoding solution that detects the encoding of each page based on headers
Type-Enhancement
Priority-Medium
Milestone-Release1.2.3
|
111
|
Fixed |
When set to not crawl external links, they are still added to the scheduler to be crawled and later skipped
Type-Defect
Priority-Medium
Milestone-Release1.2.3
|
110
|
Accepted |
Separate HTTP web requests and processing onto separate threads
Type-Enhancement
Priority-Medium
Milestone-Release2.0
|
109
|
Duplicate |
Add "Advanced Usage" documentation
Type-Task
Priority-Medium
Milestone-Release1.2.3
|
108
|
Fixed |
Suggestion: Allow crawling based on an XML sitemap
Type-Defect
Priority-Medium
|
107
|
WontFix |
Suggested feature: More configurable logging w/ log4net
Type-Feature
Priority-Medium
|
106
|
Invalid |
Reference to Abot.dll is broken
Type-Defect
Priority-Medium
|
105
|
Invalid |
Memory leakage
Type-Defect
Priority-Medium
|
104
|
WontFix |
Consider using ILMerge again to consolidate dependencies
Priority-Medium
Type-Enhancement
|
103
|
Fixed |
Update documentation for Abot.Core.ConfigurationSectionHandler
Type-Defect
Priority-Critical
|
102
|
Accepted |
Create code project article
Type-Task
Priority-Medium
Milestone-Release1.2.3
|
101
|
Fixed |
Add dynamic Expando object to CrawledPage
Type-Enhancement
Priority-Medium
Milestone-Release1.2
|
100
|
Fixed |
Update documentation to reflect 1.2 changes
Type-Task
Priority-Critical
Milestone-Release1.2.3
|
99
|
Fixed |
Add isForcedLinkParsingEnabled that will parse links even if the links on that page should not be crawled
Type-Enhancement
Priority-Medium
Milestone-Release1.2
|
98
|
Fixed |
Test TaskThreadManager to see if it can become the default IThreadManager
Type-Task
Priority-Medium
Milestone-Release1.2
|
97
|
Invalid |
Add proxy list to the crawler
Type-Defect
Priority-Medium
|
96
|
Fixed |
Add better cancellation to signal "hard stop"
Type-Defect
Priority-Critical
Milestone-Release1.2
|
95
|
Fixed |
Update TaskThreadManager to use the ManualResetEvent to avoid busy wait
Type-Enhancement
Priority-High
Milestone-Release1.2
|
94
|
Fixed |
SiteSimulator integration test fails on zoogle.com external link (301 instead of 200)
Type-Task
Priority-Critical
Milestone-Release1.2
|
93
|
Fixed |
If maxConcurrentThreads is 0, then use System.Environment.ProcessorCount
Type-Enhancement
Priority-Medium
Milestone-Release1.2
|
92
|
Fixed |
Support the crawling of a relatively unlimited number of sites/pages by a single crawler instance
Type-Enhancement
Priority-Medium
Milestone-Release1.2.3
|
91
|
WontFix |
Suggestion
Type-Defect
Priority-Medium
|
90
|
Accepted |
Create automated performance tests
Type-Task
Priority-Medium
Milestone-Release2.0
|
89
|
Fixed |
Create a build server
Type-Task
Priority-Medium
Milestone-Release1.2
|
88
|
Accepted |
Add JavaScript link parser
Priority-Medium
Milestone-Release2.0
Type-Feature
|
87
|
Accepted |
Create a "nothing to something with Abot" video
Type-Task
Priority-Medium
Milestone-Release2.0
|
86
|
Fixed |
Demo crashes in Mono
Type-Defect
Priority-Medium
Milestone-Release1.2
|
85
|
Fixed |
Consider getting back on NuGet
Type-Defect
Priority-Medium
Milestone-Release1.2
|
84
|
Accepted |
Tune Abot to run well on Mono
Priority-Medium
Type-Enhancement
Milestone-Release2.0
|
83
|
Fixed |
Change CrawlContext.CrawledUrls to ConcurrentDictionary for faster lookup
Type-Defect
Priority-Medium
Milestone-Release1.1
|
82
|
Fixed |
HyperLinkParser in conjunction with HTTP redirects
Type-Defect
Priority-Medium
Milestone-Release1.1
|
81
|
Fixed |
Add MaxCrawlDepth property
Type-Defect
Priority-Medium
Milestone-Release1.1
|
80
|
Fixed |
Work on IThreadManagers
Type-Defect
Priority-High
Milestone-Release1.2
|
79
|
Accepted |
Compare Mono vs. Windows performance on apples-to-apples hardware
Type-Task
Priority-Medium
Milestone-Release2.0
|
78
|
Fixed |
CsQuery blows up on double encoding
Type-Defect
Priority-Medium
Milestone-Release1.1
|
77
|
Fixed |
HtmlAgilityPack throws StackOverflowException on pages with lots of nested tags
Type-Defect
Priority-High
|
76
|
Fixed |
Add IEnumerable<Uri> PageLinks
Type-Enhancement
Priority-Medium
Milestone-Release1.2
|
75
|
Fixed |
Implement robots no follow
Type-Feature
Priority-Medium
Milestone-Release1.2
|
74
|
Accepted |
Add automatic throttling
Type-Feature
Priority-Medium
Milestone-Release2.0
|
73
|
WontFix |
Create BulkCrawler that manages multiple instances of IWebCrawler
Type-Feature
Priority-Medium
|
72
|
Fixed |
Make CrawledPage.CsQueryDocument & CrawledPage.HtmlDocument ILazy<T>
Type-Defect
Priority-High
Milestone-Release1.1
|
71
|
Fixed |
Add a dynamic crawl bag so users may pass in a custom object that will be added to the crawl context
Type-Defect
Priority-Medium
Milestone-Release1.1
|
68
|
Fixed |
Make CrawlConfiguration modifiable after it is loaded from the app.config file
Type-Enhancement
Priority-Medium
Milestone-Release1.1
|
67
|
Fixed |
Completely remove ILMerge due to several issues
Type-Defect
Priority-Critical
Milestone-Release1.1
|
66
|
Fixed |
Remove log4net from ILMerge command
Type-Defect
Priority-High
Milestone-Release1.1
|
65
|
WontFix |
Extract postbuild commands into bat and bash files
Type-Enhancement
Priority-Medium
Milestone-Release1.2
|
64
|
WontFix |
Add IsDecisionPermanent property to CrawlDecision
Type-Enhancement
Priority-High
Milestone-Release1.1
|
63
|
Fixed |
CrawlResult.ErrorOccurred and CrawlResult.ErrorMessage are never set outside of unit tests
Type-Defect
Priority-Medium
Milestone-Release1.1
|
62
|
Fixed |
Add crawl context to event args to make them available to event subscribers
Type-Enhancement
Priority-Medium
Milestone-Release1.1
|
61
|
WontFix |
Create new log file with website name on every crawl
Type-Enhancement
Priority-Medium
Milestone-Release1.1
|
60
|
Fixed |
Add IScheduler to the crawl context so people can add urls during the crawl.
Type-Enhancement
Priority-Medium
Milestone-Release1.1
|
59
|
Fixed |
Add the HtmlAgilityPack-loaded HTML document to the crawled page so more parsing can take place
Type-Enhancement
Priority-Medium
Milestone-Release1.1
|
58
|
Fixed |
Crawler crawls over MaxPagesToCrawl by up to X pages, X being the value of MaxConcurrentThreads
Type-Defect
Priority-Critical
Milestone-Release1.1
|
57
|
Fixed |
Update documentation to reflect 1.1 changes
Type-Task
Priority-Critical
Milestone-Release1.1
|
56
|
Fixed |
Reconsider targeting .NET 4.0 so VS 2010 users can work with the source code
Type-Task
Priority-Medium
Milestone-Release1.1
|
55
|
Fixed |
Limit memory usage for the process running Abot
Type-Defect
Priority-High
Milestone-Release1.1
|
54
|
Accepted |
Add SimulateUserClicks config value
Priority-Medium
Type-Feature
Milestone-Release2.0
|
52
|
Fixed |
Abot.Tests.Integration is not logging all library log statements
Type-Defect
Priority-Critical
Milestone-Release1.1
|
51
|
Fixed |
Add config value for MaxPagesToCrawlPerDomain
Type-Feature
Priority-Medium
Milestone-Release1.1
|
50
|
WontFix |
Make Abot check its version and, if it is older than the latest "featured" version, log a message suggesting an update
Type-Enhancement
Priority-Medium
Milestone-Release1.1
|
49
|
Done |
Add license text to each page
Type-Enhancement
Priority-Critical
Milestone-Release1.1
|
47
|
Done |
Add page for custom crawler work by hour
Type-Task
Priority-Medium
Milestone-Release1.0
|
46
|
Fixed |
Create Google Groups discussion
Type-Task
Priority-Critical
Milestone-Release1.0
|
45
|
Fixed |
Add constructors to WebCrawler: one that takes only ICrawlDecisionMaker, and one that takes both ICrawlDecisionMaker and CrawlConfiguration
Type-Enhancement
Priority-High
Milestone-Release1.1
|
44
|
Fixed |
Add Abot version dynamically to the user agent string
Type-Enhancement
Priority-Medium
Milestone-Release1.1
|
43
|
WontFix |
Think about moving unique URI crawling check/logic to IScheduler
Type-Enhancement
Priority-Medium
Milestone-Release1.1
|
42
|
Fixed |
Use concurrent collections for Scheduler and CrawlContext.CrawledUris
Type-Enhancement
Priority-Medium
Milestone-Release1.1
|
41
|
Fixed |
Implement use of isUriRecrawlingEnabled
Type-Enhancement
Priority-Medium
Milestone-Release1.0
|
40
|
Fixed |
Implement use of downloadableContentTypes config value
Type-Enhancement
Priority-Medium
Milestone-Release1.0
|
39
|
Fixed |
Implement manual crawl delay
Type-Feature
Priority-Medium
Milestone-Release1.0
|
38
|
Duplicate |
Implement crawl timeout
Type-Feature
Priority-Medium
Milestone-Release1.0
|
37
|
WontFix |
Implement crawl depth
Type-Feature
Priority-Medium
Milestone-Release1.0
|
36
|
Fixed |
Update all assemblies to 4.5
Type-Task
Priority-High
Milestone-Release1.0
|
35
|
Fixed |
Update documentation/Downloads
Type-Task
Priority-Critical
Milestone-Release1.0
|
34
|
WontFix |
Use VS Fakes to raise code coverage on untestable code
Type-Enhancement
Priority-Medium
|
33
|
Fixed |
Consider using CsQuery as the parser
Type-Enhancement
Priority-Low
Milestone-Release1.1
|
32
|
Done |
Spread the word
Type-Task
Priority-Medium
Milestone-Release1.0
|
31
|
Fixed |
Use ILMerge to create a single Abot.dll with all dependent dlls
Type-Task
Priority-Medium
Milestone-Release1.0
|
30
|
Fixed |
Create NuGet installer
Priority-Medium
Type-Task
Milestone-Release1.2.3
|
29
|
WontFix |
Add crawl recovery
Type-Feature
Priority-Medium
Milestone-Release1.1
|
28
|
Fixed |
Add crawl timeout
Type-Enhancement
Priority-High
Milestone-Release1.0
|
27
|
Fixed |
Add MaxPagesToCrawl check
Type-Enhancement
Priority-High
Milestone-Release1.0
|