Issue 404: Rack environment strings should have appropriate encoding and REQUEST_URI and friends should not be pre-unescaped
13 people starred this issue and may be notified of changes.
Status:  Fixed
Owner:  ----
Closed:  Feb 2011


Reported by adam.q.salter@gmail.com, Oct 26, 2009
What steps will reproduce the problem?
1. Start new rails app.
2. Configure Passenger to start this app.
3. Configure Passenger to start ruby1.9 in UTF8 encoding (as specified here: [1])
4. Attempt to load a multibyte char URL such as http://localhost/posts/你好

What is the expected output? What do you see instead?

Application outputs the following in the apache error_log:

Error during failsafe response: "\xE4" from ASCII-8BIT to UTF-8
[Mon Oct 26 18:29:50 2009] [error] [client 127.0.0.1] Premature end of script headers: \xe4\xbd\xa0\xe5\xa5\xbd
[ pid=2929 file=ext/apache2/Hooks.cpp:711 time=2009-10-26 18:29:50.113 ]:
  The backend application (process 2935) didn't send a valid HTTP response. It might have crashed during the middle of sending an HTTP response, so please check whether there are crashing problems in your application. This is the data that it sent: [X-Powered-By]
*** Exception NoMethodError in application (You have a nil object when you didn't expect it!
You might have expected an instance of Array.
The error occurred while evaluating nil.each) (process 2935):
	from /opt/local/lib/ruby1.9/gems/1.9.1/gems/passenger-2.2.5/lib/phusion_passenger/rack/request_handler.rb:99:in `process_request'
	from /opt/local/lib/ruby1.9/gems/1.9.1/gems/passenger-2.2.5/lib/phusion_passenger/abstract_request_handler.rb:207:in `main_loop'
	from /opt/local/lib/ruby1.9/gems/1.9.1/gems/passenger-2.2.5/lib/phusion_passenger/railz/application_spawner.rb:378:in `start_request_handler'
	from /opt/local/lib/ruby1.9/gems/1.9.1/gems/passenger-2.2.5/lib/phusion_passenger/railz/application_spawner.rb:336:in `block in handle_spawn_application'
	from /opt/local/lib/ruby1.9/gems/1.9.1/gems/passenger-2.2.5/lib/phusion_passenger/utils.rb:183:in `safe_fork'
	from /opt/local/lib/ruby1.9/gems/1.9.1/gems/passenger-2.2.5/lib/phusion_passenger/railz/application_spawner.rb:334:in `handle_spawn_application'
	from /opt/local/lib/ruby1.9/gems/1.9.1/gems/passenger-2.2.5/lib/phusion_passenger/abstract_server.rb:352:in `main_loop'
	from /opt/local/lib/ruby1.9/gems/1.9.1/gems/passenger-2.2.5/lib/phusion_passenger/abstract_server.rb:196:in `start_synchronously'
	from /opt/local/lib/ruby1.9/gems/1.9.1/gems/passenger-2.2.5/lib/phusion_passenger/abstract_server.rb:163:in `start'
	from /opt/local/lib/ruby1.9/gems/1.9.1/gems/passenger-2.2.5/lib/phusion_passenger/railz/application_spawner.rb:213:in `start'
	from /opt/local/lib/ruby1.9/gems/1.9.1/gems/passenger-2.2.5/lib/phusion_passenger/spawn_manager.rb:262:in `block (2 levels) in spawn_rails_application'
	from /opt/local/lib/ruby1.9/gems/1.9.1/gems/passenger-2.2.5/lib/phusion_passenger/abstract_server_collection.rb:126:in `lookup_or_add'
	from /opt/local/lib/ruby1.9/gems/1.9.1/gems/passenger-2.2.5/lib/phusion_passenger/spawn_manager.rb:256:in `block in spawn_rails_application'
	from /opt/local/lib/ruby1.9/gems/1.9.1/gems/passenger-2.2.5/lib/phusion_passenger/abstract_server_collection.rb:80:in `block in synchronize'
	from <internal:prelude>:8:in `synchronize'
	from /opt/local/lib/ruby1.9/gems/1.9.1/gems/passenger-2.2.5/lib/phusion_passenger/abstract_server_collection.rb:79:in `synchronize'
	from /opt/local/lib/ruby1.9/gems/1.9.1/gems/passenger-2.2.5/lib/phusion_passenger/spawn_manager.rb:255:in `spawn_rails_application'
	from /opt/local/lib/ruby1.9/gems/1.9.1/gems/passenger-2.2.5/lib/phusion_passenger/spawn_manager.rb:154:in `spawn_application'
	from /opt/local/lib/ruby1.9/gems/1.9.1/gems/passenger-2.2.5/lib/phusion_passenger/spawn_manager.rb:287:in `handle_spawn_application'
	from /opt/local/lib/ruby1.9/gems/1.9.1/gems/passenger-2.2.5/lib/phusion_passenger/abstract_server.rb:352:in `main_loop'
	from /opt/local/lib/ruby1.9/gems/1.9.1/gems/passenger-2.2.5/lib/phusion_passenger/abstract_server.rb:196:in `start_synchronously'
	from /opt/local/lib/ruby1.9/gems/1.9.1/gems/passenger-2.2.5/bin/passenger-spawn-server:61:in `<main>'

This works fine if I start rails using ./script/server 

What version of Phusion Passenger are you using? Which version of Rails? On
what operating system?

Rails 2.3.4
Passenger 2.2.5
Mac OSX 10.6.1
Ruby 1.9.1p243

Also small note/request:

Rails starts in UTF-8 by default on Ruby 1.9 [2], so it would be nice if we didn't have to use the above UTF8 script proxy. A PassengerRubyEncoding setting would be appreciated (or even just allowing config options in the PassengerRuby setting).
E.g. 
PassengerRuby /opt/local/bin/ruby1.9 -E UTF8:UTF16 -I/lib

Please provide any additional information below.

[1] https://code.google.com/p/phusion-passenger/issues/detail?id=233&can=1&colspec=ID%20Type%20Status%20Priority%20Milestone%20Stars%20Summary&start=200#c2

[2] http://github.com/rails/rails/commit/cf3ccd7be0caae67dfddf5bc5056dcfdaab6f369
Oct 29, 2009
Project Member #1 honglilai@gmail.com
We start Ruby in its default encoding, which is UTF-8 on most systems. It's just that
the request handler part operates purely on binary data. Now it looks like something,
somewhere in Rails is expecting UTF-8 data instead of binary, but I can't see which
part of Rails it is because the failsafe backtrace is swallowed. Could you try to
obtain the failsafe backtrace?
Oct 29, 2009
Project Member #2 honglilai@gmail.com
Actually, after playing around with Ruby 1.9, I suspect it's a bug in Ruby's encoding
conversion code. Consider the following irb session (url.txt is a file which contains
your Chinese URL):

irb(main):005:0> data = File.open('url.txt', 'rb') { |f| f.read }
=> "http://localhost/posts/\xE4\xBD\xA0\xE5\xA5\xBD"
irb(main):006:0> data.encode!('utf-8')
Encoding::UndefinedConversionError: "\xE4" from ASCII-8BIT to UTF-8
        from (irb):6:in `encode!'
        from (irb):6
        from /opt/ruby191/bin/irb:12:in `<main>'
irb(main):007:0> data = File.open('url.txt', 'r') { |f| f.read }
=> "http://localhost/posts/你好"
irb(main):008:0> data.encode('binary')
Encoding::UndefinedConversionError: U+4F60 from UTF-8 to ASCII-8BIT
        from (irb):8:in `encode'
        from (irb):8
        from /opt/ruby191/bin/irb:12:in `<main>'

Ruby's unable to convert the raw binary data to UTF-8, and unable to convert UTF-8 to
its raw binary data. I don't think this is supposed to happen.
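
The two failures above are consistent with the difference between `String#encode` and `String#force_encoding`. A minimal sketch, assuming the URL bytes arrive labeled as binary (the variable names are illustrative, not taken from Passenger): `encode` tries to transcode characters and has no mapping for bytes above 0x7F in ASCII-8BIT, while `force_encoding` only changes the label on the same bytes.

```ruby
# The six bytes of "你好", labeled as binary the way a request handler might hold them.
raw = "\xE4\xBD\xA0\xE5\xA5\xBD".dup.force_encoding(Encoding::BINARY)

# encode transcodes: there is no character mapping from ASCII-8BIT to UTF-8
# for bytes above 0x7F, so this raises instead of converting.
transcode_error = begin
  raw.encode(Encoding::UTF_8)
  nil
rescue Encoding::UndefinedConversionError => e
  e.class
end

# force_encoding only relabels the string; the bytes are left untouched.
text = raw.dup.force_encoding(Encoding::UTF_8)

puts transcode_error.inspect     # the exception class raised by encode
puts text.valid_encoding?        # whether the relabeled bytes form valid UTF-8
puts text.bytes == raw.bytes     # whether any byte changed
```

Relabeling is only safe when the bytes really are UTF-8, which is why checking `valid_encoding?` afterwards is a reasonable habit.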
Oct 29, 2009
#3 adam.q.salter@gmail.com
Hongli,

I don't get the first error, but do get the second.

>> "http://localhost/posts/\xE4\xBD\xA0\xE5\xA5\xBD".encode!('UTF-8')
=> "http://localhost/posts/你好"
>> "http://localhost/posts/\xE4\xBD\xA0\xE5\xA5\xBD".encode!('UTF-8').encode('ASCII-8BIT')
Encoding::UndefinedConversionError: "\xE4\xBD\xA0" from UTF-8 to ASCII-8BIT
	from (irb):4:in `encode'
	from (irb):4
	from /opt/local/bin/irb:12:in `<main>'

I've raised a bug on Ruby. Will let you know how it goes.

http://redmine.ruby-lang.org/issues/show/2313
Oct 30, 2009
Project Member #4 honglilai@gmail.com
From the answer in the bug tracker it looks like I did something wrong. :)

So the question now becomes: which environment strings are supposed to be UTF-8? The
Rack spec doesn't specify this. I'll discuss this with the Rack and Rails teams.
Oct 30, 2009
#5 adam.q.salter@gmail.com
I'm also dealing with another error - in HAML - and it seems pretty odd that strings are treated differently in 
different template systems. Could you also peruse this ticket and see if it is related?

http://github.com/nex3/haml/issues/closed/#issue/3/comment/66489

What I would like to see is a Rails setting whereby you could set the 'global' encoding. So all strings into/out 
of db and inside template systems would use/expect this encoding. I'm pretty sure Rails just uses UTF-8 by 
default - which is actually fine, but at some point the strings get encoded ASCII-8BIT before output and that 
means that strings from the db are not in the default encoding for output (I think ;).
Oct 30, 2009
Project Member #6 honglilai@gmail.com
No, a global encoding would go against the idea of having strings with encoding in
the first place. That'd be like Ruby 1.8. The current problems occur because it's not
clearly defined what input should be in what encoding; most apps just assume
everything is binary.
Oct 30, 2009
#7 adam.q.salter@gmail.com
Sorry, my bad. ;)

I've been doing my own testing and I'm not really familiar with how everything should work.

I found this commit to the Rails code base [1] which essentially tries to set a global encoding, but it's since 
been removed [2].

I can see it's complicated.

[1] http://github.com/rails/rails/commit/cf3ccd7be0caae67dfddf5bc5056dcfdaab6f369
[2] http://github.com/rails/rails/blob/2-3-stable/railties/lib/initializer.rb#L428
Feb 10, 2010
Project Member #8 honglilai@gmail.com
I can't reproduce this problem anymore with Rails 2.3.5 and Ruby 1.9.2dev. Can you?
Labels: PossiblyOutdated
Feb 11, 2010
#9 niels%he...@gtempaccount.com
honglilai, I'm running Rails 2.3.5 but can't currently test against Ruby trunk. Could
you point out the commit that fixed the issue so that it can be backported?
Feb 11, 2010
Project Member #10 honglilai@gmail.com
I didn't fix anything, I just can't reproduce the problem.
Feb 11, 2010
#11 niels%he...@gtempaccount.com
I'm still seeing this issue with Rails 2.3.5, Apache 2.2.12, Passenger 2.2.9 and
Ruby 1.9.2dev (2010-02-11 trunk 26647).
Feb 11, 2010
Project Member #12 honglilai@gmail.com
Does it happen with any app, or just your specific app?
Aug 18, 2010
#13 benhut...@gmail.com
I am seeing this same error, doing the same thing.  Works fine in mongrel.  Fails in passenger.  

Rails 2.3.8, Apache 2.2.14, Passenger 2.2.15, Ruby 1.9.1p378

How might I go about getting a better backtrace?  
Sep 12, 2010
#15 maeld...@gmail.com
I can reproduce the same bug with Passenger and the Prototype legacy helper plugin. It only happens with Passenger, and only when the URL contains a UTF-8 string.

'rails server -e production' works fine with the same url.

Apache + Passenger makes this:


ActionView::Template::Error (incompatible character encodings: UTF-8 and ASCII-8BIT):
    10: <% end %>
    11: 
    12:                         <li class="login">
    13:                         <%= form_remote_tag :url => '/user/login', :update => 'login_result', :before => (update_page do |p| p.replace_html 'login_result', 'Próbálkozom...' end) do %>
    14:                         <%= hidden_field_tag(:uri, request.env['REQUEST_URI']) %>
    15:                         <%= hidden_field_tag(:login_type) %>
  app/views/user/_menu.rhtml:13:in `_app_views_user__menu_rhtml___1575938245245858503_70121165064720__2264297866215139544'
  app/views/layouts/base.rhtml:32:in `_app_views_layouts_base_rhtml___4273967061461057810_70121166750520__2870595627455645602'

I have tried putting '# encoding: utf-8' into all the views and the prototype helper source files, but it didn't help.

$ ruby -v 
ruby 1.9.2p0 (2010-08-18 revision 29036) [x86_64-linux]

$ gem list

*** LOCAL GEMS ***

abstract (1.0.0)
actionmailer (3.0.0)
actionpack (3.0.0)
activemodel (3.0.0)
activerecord (3.0.0)
activeresource (3.0.0)
activesupport (3.0.0)
arel (1.0.1)
builder (2.1.2)
bundler (1.0.0)
erubis (2.6.6)
fastthread (1.0.7)
i18n (0.4.1)
mail (2.2.5)
mime-types (1.16)
passenger (2.2.15)
pg (0.9.0)
polyglot (0.3.1)
rack (1.2.1)
rack-mount (0.6.13)
rack-test (0.5.4)
rails (3.0.0)
railties (3.0.0)
rake (0.8.7)
rmagick (2.13.1)
thor (0.14.0)
treetop (1.4.8)
tzinfo (0.3.23)

Gentoo Hardened 64, RVM, Apache2

http://mage.hu/profile/Magony%20T%C3%BCnde
http://mage.hu/profile/Mage
Sep 12, 2010
#16 maeld...@gmail.com
I found an easy way to produce this bug.

Just create a new rails application with ruby 1.9.2 and rails 3.

Generate any scaffold. I generated a scaffold named Club.

run the application, insert one item with the scaffold.

Modify the controller:

  def show
    #@club = Club.find(params[:id])
    @club = Club.find 1

Insert two lines at the end of the show.html.erb:

<%= hidden_field_tag(:uri, request.env['REQUEST_URI']) %>
<%= hidden_field_tag(:login_type, value: 'jó') %>

Restart the app.

go to show/ü (or any utf8 character).

http://localhost:3000/clubs/%C3%BC


With 'rails server -e production' it works and the page's source will contain this:

<a href="/clubs/1/edit">Edit</a> |
<a href="/clubs">Back</a>

<input id="uri" name="uri" type="hidden" value="http://localhost:3000/clubs/%C3%BC" />
<input id="login_type" name="login_type" type="hidden" value="{:value=&gt;&quot;jó&quot;}" />

However, with passenger, you will see this:

http://testrails/clubs/%C3%BC


 Encoding::CompatibilityError in Clubs#show

Showing /home/mage/temp/testrails/app/views/clubs/show.html.erb where line #13 raised:

incompatible character encodings: ASCII-8BIT and UTF-8

Extracted source (around line #13):

10: <%= link_to 'Back', clubs_path %>
11: 
12: <%= hidden_field_tag(:uri, request.env['REQUEST_URI']) %>
13: <%= hidden_field_tag(:login_type, value: 'jó') %>
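
The error in this reproduction can be triggered in plain Ruby, without Rails or Passenger. A minimal sketch, assuming (as later comments in this thread suggest) that the URI reaches the view already unescaped and labeled ASCII-8BIT; the variable names are illustrative:

```ruby
# Assumption: the URI arrives already unescaped and labeled as binary.
request_uri = "/clubs/\xC3\xBC".dup.force_encoding(Encoding::BINARY)
label = "jó" # a UTF-8 string with a non-ASCII character, as in the template

# Joining two strings that both contain non-ASCII bytes but carry
# different encoding labels raises a compatibility error.
error_class = begin
  request_uri + label
  nil
rescue Encoding::CompatibilityError => e
  e.class
end
puts error_class.inspect

# A binary string that is pure ASCII (the still-escaped form) joins cleanly,
# which matches the behavior seen with 'rails server'.
escaped_uri = "/clubs/%C3%BC".dup.force_encoding(Encoding::BINARY)
joined = escaped_uri + label
puts joined
```

This is why the template only fails when both the URI and the other field contain non-ASCII data.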

Sep 12, 2010
Project Member #17 honglilai@gmail.com
Thanks for the reproduction case. I now have a clear view of the problem.

The encoding situation is very messy. Phusion Passenger is not the only one at fault, if one can even say that anyone at all is at fault. The Rack specification doesn't mandate any encoding requirements, and it would appear that the Rails helpers don't try to convert encodings. I'm not sure whether they should. I've started a discussion topic at the Rack mailing list here to discuss the problem and potential solutions. http://groups.google.com/group/rack-devel/browse_thread/thread/76694078a926e768
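
One application-side workaround for the ambiguity described above is to relabel selected environment strings before the app sees them. The following is a hypothetical Rack middleware sketch, not part of Passenger, Rack or Rails; the class name `ForceUtf8Env` and the list of keys are assumptions chosen for illustration.

```ruby
# Hypothetical middleware: relabels selected Rack env strings as UTF-8
# when their bytes are valid UTF-8, and leaves them alone otherwise.
class ForceUtf8Env
  # Assumed list of keys; a real app would pick the ones it reads.
  KEYS = %w[REQUEST_URI PATH_INFO QUERY_STRING].freeze

  def initialize(app)
    @app = app
  end

  def call(env)
    KEYS.each do |key|
      value = env[key]
      next unless value.is_a?(String)

      utf8 = value.dup.force_encoding(Encoding::UTF_8)
      # Only swap in the relabeled copy if the bytes really are UTF-8.
      env[key] = utf8 if utf8.valid_encoding?
    end
    @app.call(env)
  end
end

# Usage with a trivial app that reports the encoding it was handed.
inner = lambda { |env| [200, {}, [env['REQUEST_URI'].encoding.name]] }
app = ForceUtf8Env.new(inner)
env = { 'REQUEST_URI' => "/clubs/\xC3\xBC".dup.force_encoding(Encoding::BINARY) }
_status, _headers, body = app.call(env)
puts body.first
```

This only papers over the labeling problem; it does not address a web server unescaping the URI before the application receives it.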
Sep 13, 2010
Project Member #18 honglilai@gmail.com
Looks like part of the problem comes from the fact that Nginx unescapes the URI before passing it to Phusion Passenger. I guess that needs to be fixed.
Sep 13, 2010
Project Member #19 honglilai@gmail.com
Labels: -Priority-Medium -PossiblyOutdated Priority-High Milestone-3.0.1
Oct 1, 2010
#20 interfa...@gmail.com
I had some issues with UTF-8 characters in ruby code and what I did was to add "-Ku" to the ruby command in the rvm wrapper called by passenger.
Maybe it won't help you, but I thought I would mention it...
Oct 10, 2010
Project Member #21 honglilai@gmail.com
 Issue 510  has been merged into this issue.
Nov 15, 2010
Project Member #22 honglilai@gmail.com
(No comment was entered for this change.)
Summary: Rack environment strings should have appropriate encoding and REQUEST_URI and friends should not be pre-unescaped
Nov 15, 2010
Project Member #23 honglilai@gmail.com
 Issue 409  has been merged into this issue.
Nov 15, 2010
Project Member #24 honglilai@gmail.com
 Issue 447  has been merged into this issue.
Nov 15, 2010
Project Member #25 honglilai@gmail.com
 Issue 374  has been merged into this issue.
Nov 15, 2010
Project Member #26 honglilai@gmail.com
 Issue 344  has been merged into this issue.
Dec 9, 2010
#27 scott%ca...@gtempaccount.com
Was this fix supposed to be in 3.0.1? I'm still getting unescaped urls for special chars in query string using nginx/passenger 3.0.1 & ruby 1.9.2r0/rails 3.0.3.  

Example: 

nginx/passenger => query string from browser "q-c++ web developer", 
is sent to action as "q-c+++web+developer"

using webrick => query string from browser "q-c++ web developer", 
is sent to action as "q-c%2B%2B+web+developer" => this is how I would expect it to come back after one cgi escape.

As long as you don't use special reserved cgi chars this bug doesn't seem to appear.

What is the best way of getting this fixed?
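
The loss described in this comment can be sketched with Ruby's standard `uri` library. This assumes the server's pre-unescaping amounts to turning `%2B` back into `+` before the application form-decodes the value; the variable names are illustrative.

```ruby
require 'uri'

typed = "c++ web developer"

# One round of form escaping, as the browser would send it.
sent = URI.encode_www_form_component(typed)
puts sent

# Decoding exactly once recovers what the user typed.
decoded_once = URI.decode_www_form_component(sent)
puts decoded_once

# Assumption: the server has already turned %2B into "+" before the app decodes.
pre_unescaped = sent.gsub("%2B", "+")
decoded_twice = URI.decode_www_form_component(pre_unescaped)
puts decoded_twice.inspect
```

Once `%2B` has become `+`, the application cannot tell a literal plus sign from an encoded space, so the original query is unrecoverable.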
Dec 13, 2010
Project Member #28 honglilai@gmail.com
The fix is delayed. It looks like we have to push it to 3.0.3. Fixing this properly requires changes at the web server (Apache/Nginx) level.
Labels: -Milestone-3.0.1 Milestone-3.0.3
Dec 13, 2010
Project Member #29 honglilai@gmail.com
In particular, we have to undo the unescaping that Nginx already does for us.
Feb 17, 2011
Project Member #30 honglilai@gmail.com
Fixed in commit 5b20923.
Status: Fixed
Mar 12, 2011
#31 rickh...@gmail.com
Would it be possible to make the escaping of REQUEST_URI done in Passenger 3.0.3 a configurable option? In my case, I want the original unescaped URI (e.g. contains %20%2F and such). The escaping causes problems when the URI contains an escaped '/' character.

For example, if I want a search term to be 'him/her':
http://localhost/search/term/him%2Fher

The escaping done in Passenger makes the URI look like:
http://localhost/search/term/him/her

Which makes our routing impossible. We ran into this problem when switching from Mongrel to Passenger. Thanks for looking at this!
Mar 12, 2011
Project Member #32 honglilai@gmail.com
You mean you want the original *escaped* URI, not the unescaped URI. We already re-escape the URI, but there's no way to get the original without breaking things like mod_rewrite. If you want to be able to access %2F, then you must be prepared to abandon all uses of mod_rewrite.
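
Why re-escaping cannot restore `%2F` can be shown with a small sketch. Both helpers below are hypothetical stand-ins, not Passenger's actual code: `percent_decode` approximates what the web server does to the path, and `reescape` escapes everything except unreserved characters and `/`.

```ruby
# Hypothetical stand-in for the web server's unescaping of the path.
def percent_decode(path)
  path.dup.force_encoding(Encoding::BINARY).gsub(/%([0-9A-Fa-f]{2})/) { $1.hex.chr }
end

# Hypothetical re-escape: percent-encode every byte except unreserved
# characters and "/", which must stay literal to keep path structure.
def reescape(path)
  path.gsub(%r{[^A-Za-z0-9\-._~/]}) do |ch|
    ch.bytes.map { |b| format('%%%02X', b) }.join
  end
end

original = "/search/term/him%2Fher"
decoded  = percent_decode(original)
restored = reescape(decoded)

puts restored
puts original.split('/').length  # path segments before decoding
puts restored.split('/').length  # path segments after the round trip

# Characters that never act as separators survive the round trip.
puts reescape(percent_decode("/a%20b"))
```

After decoding, an escaped slash is indistinguishable from a real path separator, so any re-escaping step has to guess, and guessing "separator" is the only choice that keeps ordinary paths working.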
May 31, 2011
#33 becker.s...@gmail.com
Has this issue resurfaced? At some point this started working as expected (nginx not unescaping my URIs) - I'm guessing in 3.0.3, but now my URIs are getting unescaped by nginx again. I'm currently running nginx/1.0.0 + Phusion Passenger 3.0.7 (mod_rails/mod_rack). Maybe when nginx was upgraded to 1.0.0 it broke this?
May 30, 2013
#34 vitalie....@gmail.com
Because of the escaping issue, a URL like '/dir/%2Baer/READ+ME.txt' is converted to '/dir/+aer/READ+ME.txt'. It is then impossible to tell on the Rails side which '+' was a literal plus sign and which was a space.

Happens with passenger 4.0.5 too.
May 31, 2013
Project Member #35 honglilai@gmail.com
@vitalie.lazu: there is nothing more we can do about this. This is simply how the web server works. You should not depend on URL formats like that.
Jun 15, 2013
#36 rogerpack2005
As a note, I think I just ran into this. With passenger+nginx, http://hostname/Primary%2FYouth gets passed to Rails as hostname/Primary/Youth. You say there is no way around this, even using nginx?
Jun 20, 2013
Project Member #37 honglilai@gmail.com
Correct. Both Apache and Nginx convert the strings in a destructive manner. There's no way to get around this.