How to Resolve Django DISTINCT ON Order By Error

Lomanu4

When working with the Django ORM, you may need to retrieve distinct records, for example one row per value of a column such as sku. You can run into a PostgreSQL error if you also want to sort the result by another column, such as id. This article covers why the error arises and how to resolve it.

Understanding the Problem


The error, django.db.utils.ProgrammingError: SELECT DISTINCT ON expressions must match initial ORDER BY expressions, comes from a PostgreSQL rule about combining DISTINCT ON with ORDER BY. The expressions listed in DISTINCT ON must be the leftmost expressions of the ORDER BY clause. A query that uses DISTINCT ON (sku) but orders by id first violates this rule, so PostgreSQL rejects it.
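
To see why PostgreSQL imposes this rule, it can help to model what DISTINCT ON does. The sketch below is plain Python, not Django or PostgreSQL code; the distinct_on helper and the sample rows are hypothetical and exist only for illustration. It sorts the rows and keeps the first row of each run of equal keys, which is only well defined when the sort leads with the DISTINCT ON column.

```python
from itertools import groupby
from operator import itemgetter


def distinct_on(rows, key, order_by):
    """Hypothetical model of SELECT DISTINCT ON (key) ... ORDER BY order_by."""
    # Mirror PostgreSQL's check: ORDER BY must start with the DISTINCT ON column.
    if not order_by or order_by[0] != key:
        raise ValueError(
            "SELECT DISTINCT ON expressions must match initial ORDER BY expressions"
        )
    ordered = sorted(rows, key=itemgetter(*order_by))
    # Rows that share a key are now adjacent; keep the first row of each run.
    return [next(group) for _, group in groupby(ordered, key=itemgetter(key))]


# Made-up sample data: two skus, two rows each.
rows = [
    {"id": 1, "sku": "B"},
    {"id": 2, "sku": "A"},
    {"id": 3, "sku": "B"},
    {"id": 4, "sku": "A"},
]

# One row per sku, lowest id first within each sku; the result is ordered by sku.
picked = distinct_on(rows, key="sku", order_by=["sku", "id"])
```

Calling distinct_on(rows, key="sku", order_by=["id"]) raises the ValueError, which is the situation the ProgrammingError reports.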

Example Data Structure


Let's assume you have a Django model that looks like this:

from django.db import models

class Product(models.Model):
    id = models.AutoField(primary_key=True)
    sku = models.CharField(max_length=255)


The goal is to retrieve one row per distinct sku value and to sort those rows by id. Let's see how to restructure the query to do that.

Step-by-Step Solution

Step 1: Distinct Query with Correct Ordering


The key to resolving this issue is to make the ORDER BY clause of the DISTINCT ON query begin with the DISTINCT ON column. To sort the final result by id, run the DISTINCT ON query as a subquery and apply the id ordering in an outer query.

Applying a Subquery


Here’s how to do it:

# Inner query: one row per sku. ORDER BY starts with sku, as DISTINCT ON
# requires; id breaks ties, so the lowest id is kept for each sku.
first_per_sku = Product.objects.order_by('sku', 'id').distinct('sku')

# Outer query: select those rows by primary key and sort them by id.
# There is no DISTINCT ON at this level, so any ordering is allowed.
queryset = Product.objects.filter(id__in=first_per_sku.values('id')).order_by('id')

Explanation of the Query

  1. Inner Ordering: .order_by('sku', 'id') puts sku first in the ORDER BY clause, which is what DISTINCT ON (sku) requires. The id column breaks ties, so the row with the lowest id is kept for each sku.
  2. Distinct Declaration: Using .distinct('sku') keeps one row per unique sku value.
  3. Subquery Filter: .filter(id__in=first_per_sku.values('id')) uses the inner query as a subquery and selects the matching rows by primary key.
  4. Final Ordering: Finally, .order_by('id') sorts the outer query by id. Chaining .order_by('id') directly onto the .distinct('sku') query would raise the same ProgrammingError, which is why the ordering is applied in a separate outer query.

Final SQL Query


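For the DISTINCT ON part to be valid, the generated SQL must list the DISTINCT ON column first in ORDER BY, somewhat like this (here "defapp_log" is the table name used in this example):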

SELECT DISTINCT ON ("defapp_log"."sku") "defapp_log".*
FROM "defapp_log"
ORDER BY "defapp_log"."sku", "defapp_log"."id";
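
As a cross-check on what the query is meant to return, the sketch below models the goal in plain Python: keep the lowest-id row for each sku, then sort the surviving rows by id. It does not use Django or PostgreSQL, and the sample rows are made up for illustration.

```python
# Made-up sample data: two skus, two rows each.
rows = [
    {"id": 1, "sku": "B"},
    {"id": 2, "sku": "A"},
    {"id": 3, "sku": "B"},
    {"id": 4, "sku": "A"},
]

# Step 1: walk the rows in id order and remember the first row seen per sku.
first_by_sku = {}
for row in sorted(rows, key=lambda r: r["id"]):
    first_by_sku.setdefault(row["sku"], row)

# Step 2: sort the one-row-per-sku result by id, not by sku.
result = sorted(first_by_sku.values(), key=lambda r: r["id"])
```

Note that the final order follows id rather than sku, which is the ordering a bare DISTINCT ON query cannot produce on its own.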

Conclusion


By restructuring your query to correctly use PostgreSQL’s DISTINCT ON along with the ORDER BY clause, you're able to avoid the ProgrammingError. This allows you to retrieve distinct records efficiently while retaining the desired sorting based on id.

Frequently Asked Questions

What databases support DISTINCT ON syntax?


DISTINCT ON is a PostgreSQL feature, and in Django, passing field names to distinct() works only on the PostgreSQL backend. MySQL and SQLite do not support it, so check your database type before relying on this feature.

Can I achieve the same result without using DISTINCT ON?


Yes. You can group by sku and aggregate instead. That might not always be as efficient as DISTINCT ON, depending on your data layout.
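
A minimal sketch of the grouping approach, using Python's built-in sqlite3 module so that it runs without PostgreSQL. The product table, its columns, and the sample rows are assumptions made for this illustration, not the schema Django would generate for your app.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table standing in for the Product model.
conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, sku TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO product (id, sku) VALUES (?, ?)",
    [(1, "B"), (2, "A"), (3, "B"), (4, "A")],
)

# One row per sku with its lowest id, sorted by that id.
# GROUP BY works on SQLite, MySQL, and PostgreSQL alike.
grouped = conn.execute(
    """
    SELECT MIN(id) AS first_id, sku
    FROM product
    GROUP BY sku
    ORDER BY first_id
    """
).fetchall()
conn.close()
```

In Django, the same grouping can be written as Product.objects.values('sku').annotate(first_id=Min('id')).order_by('first_id'), with Min imported from django.db.models. It returns dictionaries holding sku and first_id rather than full model instances.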

Will this work for large data sets?


Using subqueries along with DISTINCT ON may lead to performance hits on large datasets, so always consider running performance tests as your data grows.

