dlgforge.pipeline.dedup

Question normalization and dedup registry helpers.

RunQuestionRegistry(initial_questions=None)

Registry of questions seen during a run, used to detect and filter duplicates.

Parameters:

Name Type Description Default
initial_questions Iterable[str] | None

Questions to seed the registry with. When None, no initial questions are supplied.

None

Raises:

Type Description
Exception

Construction may raise when required dependencies or inputs are invalid.

Side Effects / I/O:

- Primarily performs in-memory transformations.

Preconditions / Invariants:

- Instantiate and use through documented public methods.

Examples:

>>> from dlgforge.pipeline.dedup import RunQuestionRegistry
>>> registry = RunQuestionRegistry(initial_questions=["What is a token?"])

filter_and_commit(candidates) async

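Filter candidate questions against the registry and commit those that pass. This method is a coroutine and must be awaited.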

Parameters:

Name Type Description Default
candidates Sequence[Tuple[int, str]]

Sequence of (int, str) pairs, each an integer key and the candidate question text.

required

Returns:

Type Description
Tuple[Set[int], Set[int]]

A pair of integer sets reporting the filter outcome for the candidate keys.

Raises:

Type Description
Exception

Propagates unexpected runtime errors from downstream calls.

Side Effects / I/O:

- Primarily performs in-memory transformations.

Preconditions / Invariants:

- Callers should provide arguments matching annotated types and expected data contracts.

Examples:

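>>> import asyncio
>>> from dlgforge.pipeline.dedup import RunQuestionRegistry
>>> registry = RunQuestionRegistry()
>>> candidates = [(0, "What is a token?"), (1, "what is a token?")]
>>> result = asyncio.run(registry.filter_and_commit(candidates))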
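
The filter-then-commit pattern that this method's name describes can be sketched without dlgforge. What follows is a minimal, hypothetical stand-in, not the library's implementation: `SketchRegistry`, its `_seen` set, the `asyncio.Lock`, the case and whitespace folding, and the reading of the two returned sets as (accepted, rejected) keys are all assumptions made for illustration.

```python
import asyncio
from typing import Iterable, Optional, Sequence, Set, Tuple


class SketchRegistry:
    """Hypothetical stand-in for a run-level question registry."""

    def __init__(self, initial_questions: Optional[Iterable[str]] = None) -> None:
        # Assumption: questions are stored in a normalized form.
        self._seen: Set[str] = {self._key(q) for q in (initial_questions or [])}
        # Assumption: a lock keeps check-and-add atomic across concurrent tasks.
        self._lock = asyncio.Lock()

    @staticmethod
    def _key(text: str) -> str:
        # Illustrative normalization only: casefold and collapse whitespace.
        return " ".join(text.casefold().split())

    async def filter_and_commit(
        self, candidates: Sequence[Tuple[int, str]]
    ) -> Tuple[Set[int], Set[int]]:
        accepted: Set[int] = set()
        rejected: Set[int] = set()
        async with self._lock:
            for candidate_id, question in candidates:
                key = self._key(question)
                if key in self._seen:
                    rejected.add(candidate_id)
                else:
                    # Commit at once so later duplicates in the same batch are caught.
                    self._seen.add(key)
                    accepted.add(candidate_id)
        return accepted, rejected


async def demo() -> Tuple[Set[int], Set[int]]:
    # The registry is built inside the coroutine so the lock is created
    # while an event loop is running.
    registry = SketchRegistry(initial_questions=["What is a token?"])
    batch = [
        (0, "what is a  token?"),
        (1, "How does caching work?"),
        (2, "How does caching work?"),
    ]
    return await registry.filter_and_commit(batch)


accepted, rejected = asyncio.run(demo())
```

In this sketch, checking and committing happen under one lock, so two concurrent callers cannot both accept the same question. Whether the real registry makes the same guarantee is not stated on this page.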

normalize_question(text)

Normalize a question string so that duplicate questions can be compared.

Parameters:

Name Type Description Default
text str

Question text to normalize.

required

Returns:

Type Description
str

The normalized question text.

Raises:

Type Description
Exception

Propagates unexpected runtime errors from downstream calls.

Side Effects / I/O:

- Primarily performs in-memory transformations.

Preconditions / Invariants:

- Callers should provide arguments matching annotated types and expected data contracts.

Examples:

>>> from dlgforge.pipeline.dedup import normalize_question
>>> normalized = normalize_question("  What is   a token?  ")
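
The exact normalization rules are not documented on this page. As a rough illustration of what a question normalizer for deduplication often does, here is a hypothetical sketch; `sketch_normalize_question` and every rule in it (Unicode NFKC folding, casefolding, punctuation removal, whitespace collapsing) are assumptions, not the behavior of `normalize_question`.

```python
import re
import unicodedata


def sketch_normalize_question(text: str) -> str:
    """Hypothetical normalizer; the real rules may differ."""
    # Fold compatibility characters (e.g. full-width forms) and ignore case.
    folded = unicodedata.normalize("NFKC", text).casefold()
    # Replace anything that is not a word character or whitespace with a space.
    no_punct = re.sub(r"[^\w\s]", " ", folded)
    # Collapse runs of whitespace and trim the ends.
    return " ".join(no_punct.split())


a = sketch_normalize_question("  What is   Deduplication?  ")
b = sketch_normalize_question("what is deduplication")
```

Under these assumed rules, both inputs reduce to the same string, which is the property a dedup registry relies on when it compares questions.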