[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

... reporting a bug... asking an SQL question (joins, order by, andgroup by)



Mckoi folks,

I think I'm reporting a bug, and I'm certainly asking an SQL question.

I've reduced the broken query down as much as I have, but still have 
some SQL to work though to demonstrate my question and what I think is a 
bug.

I'm using McKoi very happily and developing a trust network related 
application.  The application uses information from many users about the 
same topic and, among other things, shows any user a view of the data 
according to whom he/she trusts.

- The users are stating what genre different artists should be
 categorized (as in the I say Beatles are Pop, he says Rock, she says
 Oldies, etc...)

- Usernames have been replaced by "rank" (higher rank is stronger.)

For certain views of the data, I just want to show the "best" single 
answer.  This is the most recent thing that the highest ranking person 
said.  Discussion continues after entering data.

drop table if exists ranked_genre;
create table ranked_genre (
 rank int not null,
 artist varchar(30) not null,
 genre varchar(30) not null,
 created timestamp not null
);

insert into ranked_genre
 (rank, artist, genre, created)
values
 (2, 'Beastie Boys', 'Rap', current_timestamp);

insert into ranked_genre
 (rank, artist, genre, created)
values
 (1, 'Beastie Boys', 'Rock', current_timestamp);

insert into ranked_genre
 (rank, artist, genre, created)
values
 (1, 'Beastie Boys', 'Hip Hop', current_timestamp);

insert into ranked_genre
 (rank, artist, genre, created)
values
 (2, 'Beastie Boys', 'Pop', current_timestamp);

insert into ranked_genre
 (rank, artist, genre, created)
values
 (3, 'Beatles', 'Pop', current_timestamp);

insert into ranked_genre
 (rank, artist, genre, created)
values
 (2, 'Beatles', 'Rock', current_timestamp);

insert into ranked_genre
 (rank, artist, genre, created)
values
 (4, 'Beatles', 'Classic Rock', current_timestamp);

insert into ranked_genre
 (rank, artist, genre, created)
values
 (2, 'Beatles', 'Oldies', current_timestamp);



The following simple query will show me all of what's been said in
order of it's importance:

select
 *
from
 ranked_genre
order by
 artist asc,
 rank asc,
 created desc;

However, for some of the views, I want to show all artists and the best 
genre pick for each (one line per artist; the genre I want to see for 
that artist if I actually do respect the rankings).  This query results 
in no error using McKoi 1.03, but I think it should generate an error, 
and I don't know what it's doing to get the results it comes up with.   
(THE BUG.)

select
 *
from
 ranked_genre
group by
 artist
order by
 artist asc,
 rank asc,
 created desc;

Using PostgreSQL, I see this error (which seems reasonable):

ERROR: Attribute ranked_genre.rank must be GROUPed or used in an
aggregate function

Now to get the answer I want, the best I can come up with is this:

select
 ranked_genre.artist, ranked_genre.genre
from
   (select
      ranked_genre.artist, ranked_genre.rank, max(created) as max_created
    from
        (select
           artist, min(rank) as min_rank
         from
           ranked_genre
         group by
           artist) as min_ranks
      inner join
         ranked_genre on (ranked_genre.artist = min_ranks.artist and
                  ranked_genre.rank = min_ranks.min_rank)
    group by
      ranked_genre.artist, ranked_genre.rank) as correct
 inner join
   ranked_genre on (ranked_genre.artist = correct.artist and
            ranked_genre.rank = correct.rank and
            ranked_genre.created = correct.max_created);

What I'm doing in the above correct query is:

1. Use this query to figure out who the highest ranked person is who
  said anything (per artist.)

        (select
           artist, min(rank) as min_rank
         from
           ranked_genre
         group by
           artist) as min_ranks

2. Joining that to only see what that person said and figuring out
  what the most currect date is per artist.

   (select
      ranked_genre.artist, ranked_genre.rank, max(created) as max_created
    from
        (select
           artist, min(rank) as min_rank
         from
           ranked_genre
         group by
           artist) as min_ranks
      inner join
         ranked_genre on (ranked_genre.artist = min_ranks.artist and
                  ranked_genre.rank = min_ranks.min_rank)
    group by
      ranked_genre.artist, ranked_genre.rank) as correct

3. Joining that with the original data to only get the rows I want.

It works, but I'm thinking that there has to be an easier way.  Is 
there?  (THE QUESTION)


Much thanks in advance,
Tom Aviv.