[Question] Is plucking columns from joined table a legit usage?

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

[Question] Is plucking columns from joined table a legit usage?

st0012
From the examples in documents, it looks like AR#pluck is mainly for retrieving columns from the same table. Like 
Post.pluck(:author_id)

But we can also find some people using pluck for joined tables' columns, like 
Post.joins(:author).pluck("author.name", :title)

And normally this works fine, except for cases like https://github.com/rails/rails/issues/36042. 

class Post < ActiveRecord::Base
  enum status: {
    a: "a",
    b: "b",
    c: "c"
  }
  has_many :comments
end

class Comment < ActiveRecord::Base
  enum status: {
    red: "red",
    green: "green",
    blue: "blue"
  }
  belongs_to :post
end

class BugTest < Minitest::Test
  def test_association_stuff
    post = Post.create!(status: "a")
    post.comments << Comment.create!(status: "blue", not_enum: "blue")

    expected = [["a", "blue", "blue"]]
    actual = Post.includes(:comments).pluck(:status, "comments.status", "comments.not_enum")
    assert_equal expected, actual
  end
end


After digging into the issue, I think plucking joined tables' columns is kind of a misuse of pluck. Here's the source code of pluck

    def pluck(*column_names)
      if loaded? && (column_names.map(&:to_s) - @klass.attribute_names - @klass.attribute_aliases.keys).empty?
        return records.pluck(*column_names)
      end

      if has_include?(column_names.first)
        relation = apply_join_dependency
        relation.pluck(*column_names)
      else
        klass.disallow_raw_sql!(column_names)
        relation = spawn
        relation.select_values = column_names
        result = skip_query_cache_if_necessary { klass.connection.select_all(relation.arel, nil) }
        result.cast_values(klass.attribute_types)
      end
    end


And the cause of the issue above is 

result.cast_values(klass.attribute_types)

Pluck seems assume it'll only be used for the same table, so it uses the original class' attribute types to cast column values (in this case, the Post's attribute types). Hence when the joined table's columns need different casting methods (like different enum definitions), Rails may get confused. (For generic types like integer, string that won't be a problem though, that's why most people don't have issue). And for our issue example, Rails tried to use Post's status type definition to cast Comment's status type, and as expected it failed.

So my question is, should we have a full support on this behavior? Which means we'd need to be able to cast types for every tables envolved. And this won't be a small change since we now need to get all tables' type definitions. Or we should consider this as a misuse and not receommending using it (or even deprecate it).






--
You received this message because you are subscribed to the Google Groups "Ruby on Rails: Core" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
Visit this group at https://groups.google.com/group/rubyonrails-core.
For more options, visit https://groups.google.com/d/optout.
Reply | Threaded
Open this post in threaded view
|

Re: [Question] Is plucking columns from joined table a legit usage?

James Coleman
At the very least on PostgreSQL (I only say that because that's where
I'm the most familiar) there's no reason to need to cast based purely
on the class's attributes -- the database result knows the database
types of the columns returns. For example, my relation_to_struct gem
does this for arbitrary calculated columns using only the ActiveRecord
APIs: https://github.com/jcoleman/relation_to_struct/blob/master/lib/relation_to_struct/active_record_relation_extension.rb#L26

I believe `pluck` should guarantee the same.

On Sat, Apr 20, 2019 at 1:28 AM Stan Lo <[hidden email]> wrote:

>
> From the examples in documents, it looks like AR#pluck is mainly for retrieving columns from the same table. Like
> Post.pluck(:author_id)
>
> But we can also find some people using pluck for joined tables' columns, like
> Post.joins(:author).pluck("author.name", :title)
>
> And normally this works fine, except for cases like https://github.com/rails/rails/issues/36042.
>
> class Post < ActiveRecord::Base
>   enum status: {
>     a: "a",
>     b: "b",
>     c: "c"
>   }
>   has_many :comments
> end
>
> class Comment < ActiveRecord::Base
>   enum status: {
>     red: "red",
>     green: "green",
>     blue: "blue"
>   }
>   belongs_to :post
> end
>
> class BugTest < Minitest::Test
>   def test_association_stuff
>     post = Post.create!(status: "a")
>     post.comments << Comment.create!(status: "blue", not_enum: "blue")
>
>     expected = [["a", "blue", "blue"]]
>     actual = Post.includes(:comments).pluck(:status, "comments.status", "comments.not_enum")
>     assert_equal expected, actual
>   end
> end
>
>
> After digging into the issue, I think plucking joined tables' columns is kind of a misuse of pluck. Here's the source code of pluck
>
>     def pluck(*column_names)
>       if loaded? && (column_names.map(&:to_s) - @klass.attribute_names - @klass.attribute_aliases.keys).empty?
>         return records.pluck(*column_names)
>       end
>
>       if has_include?(column_names.first)
>         relation = apply_join_dependency
>         relation.pluck(*column_names)
>       else
>         klass.disallow_raw_sql!(column_names)
>         relation = spawn
>         relation.select_values = column_names
>         result = skip_query_cache_if_necessary { klass.connection.select_all(relation.arel, nil) }
>         result.cast_values(klass.attribute_types)
>       end
>     end
>
>
> And the cause of the issue above is
>
> result.cast_values(klass.attribute_types)
>
> Pluck seems assume it'll only be used for the same table, so it uses the original class' attribute types to cast column values (in this case, the Post's attribute types). Hence when the joined table's columns need different casting methods (like different enum definitions), Rails may get confused. (For generic types like integer, string that won't be a problem though, that's why most people don't have issue). And for our issue example, Rails tried to use Post's status type definition to cast Comment's status type, and as expected it failed.
>
> So my question is, should we have a full support on this behavior? Which means we'd need to be able to cast types for every tables envolved. And this won't be a small change since we now need to get all tables' type definitions. Or we should consider this as a misuse and not receommending using it (or even deprecate it).
>
>
>
>
>
>
> --
> You received this message because you are subscribed to the Google Groups "Ruby on Rails: Core" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
> To post to this group, send email to [hidden email].
> Visit this group at https://groups.google.com/group/rubyonrails-core.
> For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "Ruby on Rails: Core" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
Visit this group at https://groups.google.com/group/rubyonrails-core.
For more options, visit https://groups.google.com/d/optout.
Reply | Threaded
Open this post in threaded view
|

Re: [Question] Is plucking columns from joined table a legit usage?

Alberto Almagro
In reply to this post by st0012
I've added a link in the pull request you mention so that people finding this in Github's issues view can track this.

El sábado, 20 de abril de 2019, 7:27:55 (UTC+2), Stan Lo escribió:
From the examples in documents, it looks like AR#pluck is mainly for retrieving columns from the same table. Like 
Post.pluck(:author_id)

But we can also find some people using pluck for joined tables' columns, like 
Post.joins(:author).pluck("<a href="http://author.name" target="_blank" rel="nofollow" onmousedown="this.href=&#39;http://www.google.com/url?q\x3dhttp%3A%2F%2Fauthor.name\x26sa\x3dD\x26sntz\x3d1\x26usg\x3dAFQjCNHiDtIwpBVkYaJmkoO5JvIAK9X9AQ&#39;;return true;" onclick="this.href=&#39;http://www.google.com/url?q\x3dhttp%3A%2F%2Fauthor.name\x26sa\x3dD\x26sntz\x3d1\x26usg\x3dAFQjCNHiDtIwpBVkYaJmkoO5JvIAK9X9AQ&#39;;return true;">author.name", :title)

And normally this works fine, except for cases like <a href="https://github.com/rails/rails/issues/36042" target="_blank" rel="nofollow" onmousedown="this.href=&#39;https://www.google.com/url?q\x3dhttps%3A%2F%2Fgithub.com%2Frails%2Frails%2Fissues%2F36042\x26sa\x3dD\x26sntz\x3d1\x26usg\x3dAFQjCNHhSM2BJC3zbxIAYHhlAUrNuo8yyg&#39;;return true;" onclick="this.href=&#39;https://www.google.com/url?q\x3dhttps%3A%2F%2Fgithub.com%2Frails%2Frails%2Fissues%2F36042\x26sa\x3dD\x26sntz\x3d1\x26usg\x3dAFQjCNHhSM2BJC3zbxIAYHhlAUrNuo8yyg&#39;;return true;">https://github.com/rails/rails/issues/36042. 

class Post < ActiveRecord::Base
  enum status: {
    a: "a",
    b: "b",
    c: "c"
  }
  has_many :comments
end

class Comment < ActiveRecord::Base
  enum status: {
    red: "red",
    green: "green",
    blue: "blue"
  }
  belongs_to :post
end

class BugTest < Minitest::Test
  def test_association_stuff
    post = Post.create!(status: "a")
    post.comments << Comment.create!(status: "blue", not_enum: "blue")

    expected = [["a", "blue", "blue"]]
    actual = Post.includes(:comments).pluck(:status, "comments.status", "comments.not_enum")
    assert_equal expected, actual
  end
end


After digging into the issue, I think plucking joined tables' columns is kind of a misuse of pluck. Here's the source code of pluck

    def pluck(*column_names)
      if loaded? && (column_names.map(&:to_s) - @klass.attribute_names - @klass.attribute_aliases.keys).empty?
        return records.pluck(*column_names)
      end

      if has_include?(column_names.first)
        relation = apply_join_dependency
        relation.pluck(*column_names)
      else
        klass.disallow_raw_sql!(column_names)
        relation = spawn
        relation.select_values = column_names
        result = skip_query_cache_if_necessary { klass.connection.select_all(relation.arel, nil) }
        result.cast_values(klass.attribute_types)
      end
    end


And the cause of the issue above is 

result.cast_values(klass.attribute_types)

Pluck seems assume it'll only be used for the same table, so it uses the original class' attribute types to cast column values (in this case, the Post's attribute types). Hence when the joined table's columns need different casting methods (like different enum definitions), Rails may get confused. (For generic types like integer, string that won't be a problem though, that's why most people don't have issue). And for our issue example, Rails tried to use Post's status type definition to cast Comment's status type, and as expected it failed.

So my question is, should we have a full support on this behavior? Which means we'd need to be able to cast types for every tables envolved. And this won't be a small change since we now need to get all tables' type definitions. Or we should consider this as a misuse and not receommending using it (or even deprecate it).






--
You received this message because you are subscribed to the Google Groups "Ruby on Rails: Core" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
Visit this group at https://groups.google.com/group/rubyonrails-core.
For more options, visit https://groups.google.com/d/optout.