scaleway.inference.v1 package

Submodules

scaleway.inference.v1.api module

class scaleway.inference.v1.api.InferenceV1API(client: Client, *, bypass_validation: bool = False)

Bases: API

This API allows you to handle your Managed Inference services.

Create a deployment. Create a new inference deployment related to a specific model. :param model_id: ID of the model to use. :param node_type_name: Name of the node type to use. :param endpoints: List of endpoints to create. :param region: Region to target. If none is passed will use default region from the config. :param name: Name of the deployment. :param project_id: ID of the Project to create the deployment in. :param accept_eula: If the model has an EULA, you must accept it before proceeding. The terms of the EULA can be retrieved using the GetModelEula API call. :param tags: List of tags to apply to the deployment. :param min_size: Defines the minimum size of the pool. :param max_size: Defines the maximum size of the pool. :param quantization: Quantization settings to apply to this deployment. :return: Deployment

Usage:

result = api.create_deployment(
    model_id="example",
    node_type_name="example",
    endpoints=[],
)
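The calls in this class are typically chained: create a deployment, wait for it to settle, then check the final status. The following is a minimal sketch of that flow, not part of the SDK. `deploy_and_wait` and `_OfflineAPI` are hypothetical names; the stand-in client only exists so the snippet runs without credentials, and raising `RuntimeError` on a non-ready status is a choice made here.

```python
from types import SimpleNamespace


def deploy_and_wait(api, *, model_id, node_type_name, endpoints, **extra):
    """Create a deployment, block until it settles, and fail loudly on error.

    `api` is expected to expose create_deployment() and wait_for_deployment()
    with the keyword arguments documented in this reference.
    """
    created = api.create_deployment(
        model_id=model_id,
        node_type_name=node_type_name,
        endpoints=endpoints,
        **extra,  # e.g. name=..., accept_eula=True, min_size=1, max_size=2
    )
    settled = api.wait_for_deployment(deployment_id=created.id)
    # DeploymentStatus derives from str, so comparing with "ready" works.
    if settled.status != "ready":
        raise RuntimeError(
            f"deployment {settled.id} ended in status {settled.status!r}: "
            f"{settled.error_message}"
        )
    return settled


class _OfflineAPI:
    """Hypothetical stand-in for InferenceV1API so the sketch runs offline."""

    def create_deployment(self, **kwargs):
        return SimpleNamespace(id="dep-1", status="creating", error_message=None)

    def wait_for_deployment(self, *, deployment_id, region=None, options=None):
        return SimpleNamespace(id=deployment_id, status="ready", error_message=None)


deployment = deploy_and_wait(
    _OfflineAPI(), model_id="example", node_type_name="example", endpoints=[]
)
print(deployment.id, deployment.status)
```

With a real client, pass the `InferenceV1API` instance instead of `_OfflineAPI()`.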

create_endpoint(*, deployment_id: str, endpoint: EndpointSpec, region: str | None = None) → Endpoint

Create an endpoint. Create a new Endpoint related to a specific deployment. :param deployment_id: ID of the deployment to create the endpoint for. :param endpoint: Specification of the endpoint. :param region: Region to target. If none is passed will use default region from the config. :return: Endpoint

Usage:

result = api.create_endpoint(
    deployment_id="example",
    endpoint=EndpointSpec(),
)

create_model(*, source: ModelSource, region: str | None = None, name: str | None = None, project_id: str | None = None) → Model

Import a model. Import a new model to your model library. :param source: Where to import the model from. :param region: Region to target. If none is passed will use default region from the config. :param name: Name of the model. :param project_id: ID of the Project to import the model in. :return: Model

Usage:

result = api.create_model(
    source=ModelSource(),
)

delete_deployment(*, deployment_id: str, region: str | None = None) → Deployment

Delete a deployment. Delete an existing inference deployment. :param deployment_id: ID of the deployment to delete. :param region: Region to target. If none is passed will use default region from the config. :return: Deployment

Usage:

result = api.delete_deployment(
    deployment_id="example",
)

delete_endpoint(*, endpoint_id: str, region: str | None = None) → None

Delete an endpoint. Delete an existing Endpoint. :param endpoint_id: ID of the endpoint to delete. :param region: Region to target. If none is passed will use default region from the config.

Usage:

result = api.delete_endpoint(
    endpoint_id="example",
)

delete_model(*, model_id: str, region: str | None = None) → None

Delete a model. Delete an existing model from your model library. :param model_id: ID of the model to delete. :param region: Region to target. If none is passed will use default region from the config.

Usage:

result = api.delete_model(
    model_id="example",
)

get_deployment(*, deployment_id: str, region: str | None = None) → Deployment

Get a deployment. Get the deployment for the given ID. :param deployment_id: ID of the deployment to get. :param region: Region to target. If none is passed will use default region from the config. :return: Deployment

Usage:

result = api.get_deployment(
    deployment_id="example",
)

get_deployment_certificate(*, deployment_id: str, region: str | None = None) → ScwFile

Get the CA certificate. Get the CA certificate used for the deployment of private endpoints. The CA certificate will be returned as a PEM file. :param deployment_id: ID of the deployment to get the CA certificate for. :param region: Region to target. If none is passed will use default region from the config. :return: ScwFile

Usage:

result = api.get_deployment_certificate(
    deployment_id="example",
)

get_model(*, model_id: str, region: str | None = None) → Model

Get a model. Get the model for the given ID. :param model_id: ID of the model to get. :param region: Region to target. If none is passed will use default region from the config. :return: Model

Usage:

result = api.get_model(
    model_id="example",
)

List inference deployments. List all your inference deployments. :param region: Region to target. If none is passed will use default region from the config. :param page: Page number to return. :param page_size: Maximum number of deployments to return per page. :param order_by: Order in which to return results. :param project_id: Filter by Project ID. :param organization_id: Filter by Organization ID. :param name: Filter by deployment name. :param tags: Filter by tags. :return: ListDeploymentsResponse

Usage:

result = api.list_deployments()

List inference deployments. List all your inference deployments. :param region: Region to target. If none is passed will use default region from the config. :param page: Page number to return. :param page_size: Maximum number of deployments to return per page. :param order_by: Order in which to return results. :param project_id: Filter by Project ID. :param organization_id: Filter by Organization ID. :param name: Filter by deployment name. :param tags: Filter by tags. :return: List[Deployment]

Usage:

result = api.list_deployments_all()

List models. List all available models. :param region: Region to target. If none is passed will use default region from the config. :param order_by: Order in which to return results. :param page: Page number to return. :param page_size: Maximum number of models to return per page. :param project_id: Filter by Project ID. :param name: Filter by model name. :param tags: Filter by tags. :return: ListModelsResponse

Usage:

result = api.list_models()

List models. List all available models. :param region: Region to target. If none is passed will use default region from the config. :param order_by: Order in which to return results. :param page: Page number to return. :param page_size: Maximum number of models to return per page. :param project_id: Filter by Project ID. :param name: Filter by model name. :param tags: Filter by tags. :return: List[Model]

Usage:

result = api.list_models_all()

list_node_types(*, include_disabled_types: bool, region: str | None = None, page: int | None = None, page_size: int | None = None) → ListNodeTypesResponse

List available node types. List all available node types. By default, the node types returned in the list are ordered by creation date in ascending order. :param include_disabled_types: Include disabled node types in the response. :param region: Region to target. If none is passed will use default region from the config. :param page: Page number to return. :param page_size: Maximum number of node types to return per page. :return: ListNodeTypesResponse

Usage:

result = api.list_node_types(
    include_disabled_types=False,
)

list_node_types_all(*, include_disabled_types: bool, region: str | None = None, page: int | None = None, page_size: int | None = None) → List[NodeType]

List available node types. List all available node types. By default, the node types returned in the list are ordered by creation date in ascending order. :param include_disabled_types: Include disabled node types in the response. :param region: Region to target. If none is passed will use default region from the config. :param page: Page number to return. :param page_size: Maximum number of node types to return per page. :return: List[NodeType]

Usage:

result = api.list_node_types_all(
    include_disabled_types=False,
)

Update a deployment. Update an existing inference deployment. :param deployment_id: ID of the deployment to update. :param region: Region to target. If none is passed will use default region from the config. :param name: Name of the deployment. :param tags: List of tags to apply to the deployment. :param min_size: Defines the new minimum size of the pool. :param max_size: Defines the new maximum size of the pool. :param model_id: ID of the model to set on the deployment. :param quantization: Quantization to use for the deployment. :return: Deployment

Usage:

result = api.update_deployment(
    deployment_id="example",
)

update_endpoint(*, endpoint_id: str, region: str | None = None, disable_auth: bool | None = None) → Endpoint

Update an endpoint. Update an existing Endpoint. :param endpoint_id: ID of the endpoint to update. :param region: Region to target. If none is passed will use default region from the config. :param disable_auth: By default, deployments are protected by IAM authentication. When setting this field to true, the authentication will be disabled. :return: Endpoint

Usage:

result = api.update_endpoint(
    endpoint_id="example",
)

wait_for_deployment(*, deployment_id: str, region: str | None = None, options: WaitForOptions[Deployment, bool] | None = None) → Deployment

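Wait for a deployment. Repeatedly get the deployment for the given ID until it leaves its transient statuses (see DEPLOYMENT_TRANSIENT_STATUSES). :param deployment_id: ID of the deployment to get. :param region: Region to target. If none is passed will use default region from the config. :param options: Options controlling how long and under which condition to wait. :return: Deployment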

Usage:

result = api.wait_for_deployment(
    deployment_id="example",
)
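Conceptually, waiting on a deployment amounts to polling `get_deployment` until the status is no longer one of the transient values listed in `DEPLOYMENT_TRANSIENT_STATUSES`. The sketch below spells that loop out for illustration only; it is not the SDK's implementation. The interval and timeout values are arbitrary, and `poll_deployment` and `_OfflineAPI` are hypothetical names.

```python
import time
from types import SimpleNamespace

# Values documented for scaleway.inference.v1.content.DEPLOYMENT_TRANSIENT_STATUSES.
TRANSIENT = ("creating", "deploying", "deleting", "scaling")


def poll_deployment(api, deployment_id, *, interval=5.0, timeout=600.0):
    """Call get_deployment() until the status is no longer transient."""
    deadline = time.monotonic() + timeout
    while True:
        deployment = api.get_deployment(deployment_id=deployment_id)
        if deployment.status not in TRANSIENT:
            return deployment
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"deployment {deployment_id} is still {deployment.status}"
            )
        time.sleep(interval)


class _OfflineAPI:
    """Hypothetical stand-in that becomes ready on the third call."""

    def __init__(self):
        self._statuses = iter(["creating", "deploying", "ready"])

    def get_deployment(self, *, deployment_id, region=None):
        return SimpleNamespace(id=deployment_id, status=next(self._statuses))


final = poll_deployment(_OfflineAPI(), "example", interval=0.0)
print(final.status)
```

The returned deployment can still be in `error` or `locked`; leaving the transient set only means the status has stopped changing on its own.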

wait_for_model(*, model_id: str, region: str | None = None, options: WaitForOptions[Model, bool] | None = None) → Model

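Wait for a model. Repeatedly get the model for the given ID until it leaves its transient statuses (see MODEL_TRANSIENT_STATUSES). :param model_id: ID of the model to get. :param region: Region to target. If none is passed will use default region from the config. :param options: Options controlling how long and under which condition to wait. :return: Model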

Usage:

result = api.wait_for_model(
    model_id="example",
)

scaleway.inference.v1.content module

scaleway.inference.v1.content.DEPLOYMENT_TRANSIENT_STATUSES: List[DeploymentStatus] = [<DeploymentStatus.CREATING: 'creating'>, <DeploymentStatus.DEPLOYING: 'deploying'>, <DeploymentStatus.DELETING: 'deleting'>, <DeploymentStatus.SCALING: 'scaling'>]: Lists transient statuses of the enum DeploymentStatus.

scaleway.inference.v1.content.MODEL_TRANSIENT_STATUSES: List[ModelStatus] = [<ModelStatus.PREPARING: 'preparing'>, <ModelStatus.DOWNLOADING: 'downloading'>]: Lists transient statuses of the enum ModelStatus.

scaleway.inference.v1.marshalling module

scaleway.inference.v1.marshalling.marshal_CreateDeploymentRequest(request: CreateDeploymentRequest, defaults: ProfileDefaults) → Dict[str, Any]

scaleway.inference.v1.marshalling.marshal_CreateEndpointRequest(request: CreateEndpointRequest, defaults: ProfileDefaults) → Dict[str, Any]

scaleway.inference.v1.marshalling.marshal_CreateModelRequest(request: CreateModelRequest, defaults: ProfileDefaults) → Dict[str, Any]

scaleway.inference.v1.marshalling.marshal_DeploymentQuantization(request: DeploymentQuantization, defaults: ProfileDefaults) → Dict[str, Any]

scaleway.inference.v1.marshalling.marshal_EndpointPrivateNetworkDetails(request: EndpointPrivateNetworkDetails, defaults: ProfileDefaults) → Dict[str, Any]

scaleway.inference.v1.marshalling.marshal_EndpointPublicNetworkDetails(request: EndpointPublicNetworkDetails, defaults: ProfileDefaults) → Dict[str, Any]

scaleway.inference.v1.marshalling.marshal_EndpointSpec(request: EndpointSpec, defaults: ProfileDefaults) → Dict[str, Any]

scaleway.inference.v1.marshalling.marshal_ModelSource(request: ModelSource, defaults: ProfileDefaults) → Dict[str, Any]

scaleway.inference.v1.marshalling.marshal_UpdateDeploymentRequest(request: UpdateDeploymentRequest, defaults: ProfileDefaults) → Dict[str, Any]

scaleway.inference.v1.marshalling.marshal_UpdateEndpointRequest(request: UpdateEndpointRequest, defaults: ProfileDefaults) → Dict[str, Any]

scaleway.inference.v1.marshalling.unmarshal_Deployment(data: Any) → Deployment

scaleway.inference.v1.marshalling.unmarshal_DeploymentQuantization(data: Any) → DeploymentQuantization

scaleway.inference.v1.marshalling.unmarshal_Endpoint(data: Any) → Endpoint

scaleway.inference.v1.marshalling.unmarshal_EndpointPrivateNetworkDetails(data: Any) → EndpointPrivateNetworkDetails

scaleway.inference.v1.marshalling.unmarshal_EndpointPublicNetworkDetails(data: Any) → EndpointPublicNetworkDetails

scaleway.inference.v1.marshalling.unmarshal_ListDeploymentsResponse(data: Any) → ListDeploymentsResponse

scaleway.inference.v1.marshalling.unmarshal_ListModelsResponse(data: Any) → ListModelsResponse

scaleway.inference.v1.marshalling.unmarshal_ListNodeTypesResponse(data: Any) → ListNodeTypesResponse

scaleway.inference.v1.marshalling.unmarshal_Model(data: Any) → Model

scaleway.inference.v1.marshalling.unmarshal_ModelSupportInfo(data: Any) → ModelSupportInfo

scaleway.inference.v1.marshalling.unmarshal_ModelSupportedNode(data: Any) → ModelSupportedNode

scaleway.inference.v1.marshalling.unmarshal_ModelSupportedQuantization(data: Any) → ModelSupportedQuantization

scaleway.inference.v1.marshalling.unmarshal_NodeType(data: Any) → NodeType

scaleway.inference.v1.types module

class scaleway.inference.v1.types.CreateDeploymentRequest(model_id: 'str', node_type_name: 'str', endpoints: 'List[EndpointSpec]', region: 'Optional[ScwRegion]' = None, name: 'Optional[str]' = None, project_id: 'Optional[str]' = None, accept_eula: 'Optional[bool]' = False, tags: 'Optional[List[str]]' = <factory>, min_size: 'Optional[int]' = 0, max_size: 'Optional[int]' = 0, quantization: 'Optional[DeploymentQuantization]' = None)

Bases: object

accept_eula: bool | None = False: If the model has an EULA, you must accept it before proceeding.

The terms of the EULA can be retrieved using the GetModelEula API call.

endpoints: List[EndpointSpec]: List of endpoints to create.

max_size: int | None = 0: Defines the maximum size of the pool.

min_size: int | None = 0: Defines the minimum size of the pool.

model_id: str: ID of the model to use.

name: str | None = None: Name of the deployment.

node_type_name: str: Name of the node type to use.

project_id: str | None = None: ID of the Project to create the deployment in.

quantization: DeploymentQuantization | None = None: Quantization settings to apply to this deployment.

region: str | None = None: Region to target. If none is passed will use default region from the config.

tags: List[str] | None: List of tags to apply to the deployment.

class scaleway.inference.v1.types.CreateEndpointRequest(deployment_id: 'str', endpoint: 'EndpointSpec', region: 'Optional[ScwRegion]' = None)

Bases: object

deployment_id: str: ID of the deployment to create the endpoint for.

endpoint: EndpointSpec: Specification of the endpoint.

region: str | None = None: Region to target. If none is passed will use default region from the config.

class scaleway.inference.v1.types.CreateModelRequest(source: 'ModelSource', region: 'Optional[ScwRegion]' = None, name: 'Optional[str]' = None, project_id: 'Optional[str]' = None)

Bases: object

name: str | None = None: Name of the model.

project_id: str | None = None: ID of the Project to import the model in.

region: str | None = None: Region to target. If none is passed will use default region from the config.

source: ModelSource: Where to import the model from.

class scaleway.inference.v1.types.DeleteDeploymentRequest(deployment_id: 'str', region: 'Optional[ScwRegion]' = None)

Bases: object

deployment_id: str: ID of the deployment to delete.

region: str | None = None: Region to target. If none is passed will use default region from the config.

class scaleway.inference.v1.types.DeleteEndpointRequest(endpoint_id: 'str', region: 'Optional[ScwRegion]' = None)

Bases: object

endpoint_id: str: ID of the endpoint to delete.

region: str | None = None: Region to target. If none is passed will use default region from the config.

class scaleway.inference.v1.types.DeleteModelRequest(model_id: 'str', region: 'Optional[ScwRegion]' = None)

Bases: object

model_id: str: ID of the model to delete.

region: str | None = None: Region to target. If none is passed will use default region from the config.

class scaleway.inference.v1.types.Deployment(id: 'str', name: 'str', project_id: 'str', status: 'DeploymentStatus', tags: 'List[str]', node_type_name: 'str', endpoints: 'List[Endpoint]', size: 'int', min_size: 'int', max_size: 'int', model_id: 'str', model_name: 'str', region: 'ScwRegion', error_message: 'Optional[str]' = None, quantization: 'Optional[DeploymentQuantization]' = None, created_at: 'Optional[datetime]' = None, updated_at: 'Optional[datetime]' = None)

Bases: object

created_at: datetime | None = None: Creation date of the deployment.

endpoints: List[Endpoint]: List of endpoints.

error_message: str | None = None: Displays information if your deployment is in error state.

id: str: Unique identifier.

max_size: int: Defines the maximum size of the pool.

min_size: int: Defines the minimum size of the pool.

model_id: str: ID of the model used for the deployment.

model_name: str: Name of the deployed model.

name: str: Name of the deployment.

node_type_name: str: Node type of the deployment.

project_id: str: Project ID.

quantization: DeploymentQuantization | None = None: Quantization parameters for this deployment.

region: str: Region of the deployment.

size: int: Current size of the pool.

status: DeploymentStatus: Status of the deployment.

tags: List[str]: List of tags applied to the deployment.

updated_at: datetime | None = None: Last modification date of the deployment.

class scaleway.inference.v1.types.DeploymentQuantization(bits: 'int')

Bases: object

bits: int: The number of bits each model parameter should be quantized to. The quantization method is chosen based on this value.

class scaleway.inference.v1.types.DeploymentStatus(value: str, names: Any | None = None, *args: Any, **kwargs: Any)

Bases: str, Enum

CREATING = 'creating'

DELETING = 'deleting'

DEPLOYING = 'deploying'

ERROR = 'error'

LOCKED = 'locked'

READY = 'ready'

SCALING = 'scaling'

UNKNOWN_STATUS = 'unknown_status'
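Because the status enums list both `str` and `Enum` as bases, members can be looked up by value and compared directly with plain strings. The snippet below demonstrates this with a local stand-in; `Status` is a hypothetical class that copies a few of the documented members so the example runs without the SDK installed.

```python
from enum import Enum


class Status(str, Enum):
    """Stand-in with the same bases and a few of the documented members."""

    CREATING = "creating"
    READY = "ready"
    ERROR = "error"


# Lookup by value returns the member itself.
assert Status("ready") is Status.READY
# Because of the str base, members compare equal to their plain-string value.
assert Status.READY == "ready"
# The raw string is available through .value, e.g. for logging or serializing.
print(Status.ERROR.value)
```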

class scaleway.inference.v1.types.Endpoint(id: 'str', url: 'str', disable_auth: 'bool', public_network: 'Optional[EndpointPublicNetworkDetails]' = None, private_network: 'Optional[EndpointPrivateNetworkDetails]' = None)

Bases: object

disable_auth: bool: Defines whether the authentication is disabled.

id: str: Unique identifier.

private_network: EndpointPrivateNetworkDetails | None = None

public_network: EndpointPublicNetworkDetails | None = None

url: str: For private endpoints, the URL will be accessible only from the Private Network.

In addition, private endpoints will expose a CA certificate that can be used to verify the server’s identity. This CA certificate can be retrieved using the GetDeploymentCertificate API call.

class scaleway.inference.v1.types.EndpointPrivateNetworkDetails(private_network_id: 'str')

Bases: object

private_network_id: str

class scaleway.inference.v1.types.EndpointPublicNetworkDetails

Bases: object

class scaleway.inference.v1.types.EndpointSpec(disable_auth: 'bool', public_network: 'Optional[EndpointPublicNetworkDetails]' = None, private_network: 'Optional[EndpointPrivateNetworkDetails]' = None)

Bases: object

disable_auth: bool: By default, deployments are protected by IAM authentication.

When setting this field to true, the authentication will be disabled.

private_network: EndpointPrivateNetworkDetails | None = None

public_network: EndpointPublicNetworkDetails | None = None

class scaleway.inference.v1.types.GetDeploymentCertificateRequest(deployment_id: 'str', region: 'Optional[ScwRegion]' = None)

Bases: object

deployment_id: str

region: str | None = None: Region to target. If none is passed will use default region from the config.

class scaleway.inference.v1.types.GetDeploymentRequest(deployment_id: 'str', region: 'Optional[ScwRegion]' = None)

Bases: object

deployment_id: str: ID of the deployment to get.

region: str | None = None: Region to target. If none is passed will use default region from the config.

class scaleway.inference.v1.types.GetModelRequest(model_id: 'str', region: 'Optional[ScwRegion]' = None)

Bases: object

model_id: str: ID of the model to get.

region: str | None = None: Region to target. If none is passed will use default region from the config.

class scaleway.inference.v1.types.ListDeploymentsRequest(region: 'Optional[ScwRegion]' = None, page: 'Optional[int]' = 0, page_size: 'Optional[int]' = 0, order_by: 'Optional[ListDeploymentsRequestOrderBy]' = <ListDeploymentsRequestOrderBy.CREATED_AT_DESC: 'created_at_desc'>, project_id: 'Optional[str]' = None, organization_id: 'Optional[str]' = None, name: 'Optional[str]' = None, tags: 'Optional[List[str]]' = <factory>)

Bases: object

name: str | None = None: Filter by deployment name.

order_by: ListDeploymentsRequestOrderBy | None = 'created_at_desc': Order in which to return results.

organization_id: str | None = None: Filter by Organization ID.

page: int | None = 0: Page number to return.

page_size: int | None = 0: Maximum number of deployments to return per page.

project_id: str | None = None: Filter by Project ID.

region: str | None = None: Region to target. If none is passed will use default region from the config.

tags: List[str] | None: Filter by tags.

class scaleway.inference.v1.types.ListDeploymentsRequestOrderBy(value: str, names: Any | None = None, *args: Any, **kwargs: Any)

Bases: str, Enum

CREATED_AT_ASC = 'created_at_asc'

CREATED_AT_DESC = 'created_at_desc'

NAME_ASC = 'name_asc'

NAME_DESC = 'name_desc'

class scaleway.inference.v1.types.ListDeploymentsResponse(deployments: 'List[Deployment]', total_count: 'int')

Bases: object

deployments: List[Deployment]: List of deployments on the current page.

total_count: int: Total number of deployments.

class scaleway.inference.v1.types.ListModelsRequest(region: 'Optional[ScwRegion]' = None, order_by: 'Optional[ListModelsRequestOrderBy]' = <ListModelsRequestOrderBy.DISPLAY_RANK_ASC: 'display_rank_asc'>, page: 'Optional[int]' = 0, page_size: 'Optional[int]' = 0, project_id: 'Optional[str]' = None, name: 'Optional[str]' = None, tags: 'Optional[List[str]]' = <factory>)

Bases: object

name: str | None = None: Filter by model name.

order_by: ListModelsRequestOrderBy | None = 'display_rank_asc': Order in which to return results.

page: int | None = 0: Page number to return.

page_size: int | None = 0: Maximum number of models to return per page.

project_id: str | None = None: Filter by Project ID.

region: str | None = None: Region to target. If none is passed will use default region from the config.

tags: List[str] | None: Filter by tags.

class scaleway.inference.v1.types.ListModelsRequestOrderBy(value: str, names: Any | None = None, *args: Any, **kwargs: Any)

Bases: str, Enum

CREATED_AT_ASC = 'created_at_asc'

CREATED_AT_DESC = 'created_at_desc'

DISPLAY_RANK_ASC = 'display_rank_asc'

NAME_ASC = 'name_asc'

NAME_DESC = 'name_desc'

class scaleway.inference.v1.types.ListModelsResponse(models: 'List[Model]', total_count: 'int')

Bases: object

models: List[Model]: List of models on the current page.

total_count: int: Total number of models.

class scaleway.inference.v1.types.ListNodeTypesRequest(include_disabled_types: 'bool', region: 'Optional[ScwRegion]' = None, page: 'Optional[int]' = 0, page_size: 'Optional[int]' = 0)

Bases: object

include_disabled_types: bool: Include disabled node types in the response.

page: int | None = 0: Page number to return.

page_size: int | None = 0: Maximum number of node types to return per page.

region: str | None = None: Region to target. If none is passed will use default region from the config.

class scaleway.inference.v1.types.ListNodeTypesResponse(node_types: 'List[NodeType]', total_count: 'int')

Bases: object

node_types: List[NodeType]: List of node types.

total_count: int: Total number of node types.

class scaleway.inference.v1.types.Model(id: 'str', name: 'str', project_id: 'str', tags: 'List[str]', status: 'ModelStatus', description: 'str', has_eula: 'bool', region: 'ScwRegion', nodes_support: 'List[ModelSupportInfo]', parameter_size_bits: 'int', size_bytes: 'int', error_message: 'Optional[str]' = None, created_at: 'Optional[datetime]' = None, updated_at: 'Optional[datetime]' = None)

Bases: object

created_at: datetime | None = None: Creation date of the model.

description: str: Purpose of the model.

error_message: str | None = None: Displays information if your model is in error state.

has_eula: bool: Defines whether the model has an end user license agreement.

id: str: Unique identifier.

name: str: Unique Name identifier.

nodes_support: List[ModelSupportInfo]: Supported node types with quantization options and context lengths.

parameter_size_bits: int: Size, in bits, of the model parameters.

project_id: str: Project ID.

region: str: Region of the model.

size_bytes: int: Total size, in bytes, of the model files.

status: ModelStatus: Status of the model.

tags: List[str]: List of tags applied to the model.

updated_at: datetime | None = None: Last modification date of the model.

class scaleway.inference.v1.types.ModelSource(url: 'str', secret: 'Optional[str]' = None)

Bases: object

secret: str | None = None

url: str

class scaleway.inference.v1.types.ModelStatus(value: str, names: Any | None = None, *args: Any, **kwargs: Any)

Bases: str, Enum

DOWNLOADING = 'downloading'

ERROR = 'error'

PREPARING = 'preparing'

READY = 'ready'

UNKNOWN_STATUS = 'unknown_status'

class scaleway.inference.v1.types.ModelSupportInfo(nodes: 'List[ModelSupportedNode]')

Bases: object

nodes: List[ModelSupportedNode]: List of supported node types.

class scaleway.inference.v1.types.ModelSupportedNode(node_type_name: 'str', quantizations: 'List[ModelSupportedQuantization]')

Bases: object

node_type_name: str: Supported node type.

quantizations: List[ModelSupportedQuantization]: Supported quantizations.

class scaleway.inference.v1.types.ModelSupportedQuantization(quantization_bits: 'int', allowed: 'bool', max_context_size: 'int')

Bases: object

allowed: bool: Tells whether this quantization is allowed for this node type.

max_context_size: int: Maximum inference context size available for this node type and quantization.

quantization_bits: int: Number of bits for this supported quantization.
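The support information on a model is nested three levels deep: `Model.nodes_support` holds `ModelSupportInfo` objects, each with `nodes`, each of which carries its `quantizations`. The sketch below flattens that structure for one node type. `allowed_quantizations` is a hypothetical helper, and the sample data is made up and only mimics the documented field names.

```python
from types import SimpleNamespace


def allowed_quantizations(model, node_type_name):
    """Return {bits: max_context_size} for quantizations allowed on a node type."""
    result = {}
    for info in model.nodes_support:          # List[ModelSupportInfo]
        for node in info.nodes:               # List[ModelSupportedNode]
            if node.node_type_name != node_type_name:
                continue
            for quant in node.quantizations:  # List[ModelSupportedQuantization]
                if quant.allowed:
                    result[quant.quantization_bits] = quant.max_context_size
    return result


# Made-up sample shaped like a Model; node names and numbers are illustrative.
model = SimpleNamespace(nodes_support=[SimpleNamespace(nodes=[
    SimpleNamespace(node_type_name="node-a", quantizations=[
        SimpleNamespace(quantization_bits=8, allowed=True, max_context_size=4096),
        SimpleNamespace(quantization_bits=16, allowed=False, max_context_size=0),
    ]),
])])
print(allowed_quantizations(model, "node-a"))
```

A key from the returned mapping could then be passed as `DeploymentQuantization(bits=...)` when creating or updating a deployment.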

class scaleway.inference.v1.types.NodeType(name: 'str', stock_status: 'NodeTypeStock', description: 'str', vcpus: 'int', memory: 'int', vram: 'int', disabled: 'bool', beta: 'bool', gpus: 'int', region: 'ScwRegion', created_at: 'Optional[datetime]' = None, updated_at: 'Optional[datetime]' = None)

Bases: object

beta: bool: The node type is currently in beta.

created_at: datetime | None = None: Creation date of the node type.

description: str: Current specs of the offer.

disabled: bool: The node type is currently disabled.

gpus: int: Number of GPUs.

memory: int: Quantity of RAM.

name: str: Name of the node type.

region: str: Region of the node type.

stock_status: NodeTypeStock: Current stock status for the node type.

updated_at: datetime | None = None: Last modification date of the node type.

vcpus: int: Number of virtual CPUs.

vram: int: Quantity of GPU RAM.

class scaleway.inference.v1.types.NodeTypeStock(value: str, names: Any | None = None, *args: Any, **kwargs: Any)

Bases: str, Enum

AVAILABLE = 'available'

LOW_STOCK = 'low_stock'

OUT_OF_STOCK = 'out_of_stock'

UNKNOWN_STOCK = 'unknown_stock'
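The `disabled`, `beta`, `stock_status` and `gpus` fields of `NodeType` can be combined to narrow down the result of `list_node_types_all` before creating a deployment. The following is one possible filter. Treating `low_stock` as usable and excluding beta node types by default are choices made for this sketch; `usable_node_types` is a hypothetical helper and the sample data is made up.

```python
from types import SimpleNamespace


def usable_node_types(node_types, *, min_gpus=1, allow_beta=False):
    """Keep node types that are enabled, in stock, and have enough GPUs."""
    return [
        node_type
        for node_type in node_types
        if not node_type.disabled
        and (allow_beta or not node_type.beta)
        # NodeTypeStock derives from str, so plain strings compare fine.
        and node_type.stock_status in ("available", "low_stock")
        and node_type.gpus >= min_gpus
    ]


# Made-up samples shaped like NodeType; names and numbers are illustrative.
samples = [
    SimpleNamespace(name="node-a", disabled=False, beta=False, stock_status="available", gpus=1),
    SimpleNamespace(name="node-b", disabled=False, beta=False, stock_status="out_of_stock", gpus=2),
    SimpleNamespace(name="node-c", disabled=True, beta=False, stock_status="available", gpus=4),
    SimpleNamespace(name="node-d", disabled=False, beta=True, stock_status="low_stock", gpus=2),
]
print([node_type.name for node_type in usable_node_types(samples)])
```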

class scaleway.inference.v1.types.UpdateDeploymentRequest(deployment_id: 'str', region: 'Optional[ScwRegion]' = None, name: 'Optional[str]' = None, tags: 'Optional[List[str]]' = <factory>, min_size: 'Optional[int]' = 0, max_size: 'Optional[int]' = 0, model_id: 'Optional[str]' = None, quantization: 'Optional[DeploymentQuantization]' = None)

Bases: object

deployment_id: str: ID of the deployment to update.

max_size: int | None = 0: Defines the new maximum size of the pool.

min_size: int | None = 0: Defines the new minimum size of the pool.

model_id: str | None = None: Id of the model to set to the deployment.

name: str | None = None: Name of the deployment.

quantization: DeploymentQuantization | None = None: Quantization to use to the deployment.

region: str | None = None: Region to target. If none is passed will use default region from the config.

tags: List[str] | None: List of tags to apply to the deployment.

class scaleway.inference.v1.types.UpdateEndpointRequest(endpoint_id: 'str', region: 'Optional[ScwRegion]' = None, disable_auth: 'Optional[bool]' = False)

Bases: object

disable_auth: bool | None = False: By default, deployments are protected by IAM authentication.

When setting this field to true, the authentication will be disabled.

endpoint_id: str: ID of the endpoint to update.

region: str | None = None: Region to target. If none is passed will use default region from the config.

Module contents

class scaleway.inference.v1.CreateDeploymentRequest(model_id: 'str', node_type_name: 'str', endpoints: 'List[EndpointSpec]', region: 'Optional[ScwRegion]' = None, name: 'Optional[str]' = None, project_id: 'Optional[str]' = None, accept_eula: 'Optional[bool]' = False, tags: 'Optional[List[str]]' = <factory>, min_size: 'Optional[int]' = 0, max_size: 'Optional[int]' = 0, quantization: 'Optional[DeploymentQuantization]' = None)

Bases: object

accept_eula: bool | None = False: If the model has an EULA, you must accept it before proceeding.

The terms of the EULA can be retrieved using the GetModelEula API call.

endpoints: List[EndpointSpec]: List of endpoints to create.

max_size: int | None = 0: Defines the maximum size of the pool.

min_size: int | None = 0: Defines the minimum size of the pool.

model_id: str: ID of the model to use.

name: str | None = None: Name of the deployment.

node_type_name: str: Name of the node type to use.

project_id: str | None = None: ID of the Project to create the deployment in.

quantization: DeploymentQuantization | None = None: Quantization settings to apply to this deployment.

region: str | None = None: Region to target. If none is passed will use default region from the config.

tags: List[str] | None: List of tags to apply to the deployment.

class scaleway.inference.v1.CreateEndpointRequest(deployment_id: 'str', endpoint: 'EndpointSpec', region: 'Optional[ScwRegion]' = None)

Bases: object

deployment_id: str: ID of the deployment to create the endpoint for.

endpoint: EndpointSpec: Specification of the endpoint.

region: str | None = None: Region to target. If none is passed will use default region from the config.

class scaleway.inference.v1.CreateModelRequest(source: 'ModelSource', region: 'Optional[ScwRegion]' = None, name: 'Optional[str]' = None, project_id: 'Optional[str]' = None)

Bases: object

name: str | None = None: Name of the model.

project_id: str | None = None: ID of the Project to import the model in.

region: str | None = None: Region to target. If none is passed will use default region from the config.

source: ModelSource: Where to import the model from.

class scaleway.inference.v1.DeleteDeploymentRequest(deployment_id: 'str', region: 'Optional[ScwRegion]' = None)

Bases: object

deployment_id: str: ID of the deployment to delete.

region: str | None = None: Region to target. If none is passed will use default region from the config.

class scaleway.inference.v1.DeleteEndpointRequest(endpoint_id: 'str', region: 'Optional[ScwRegion]' = None)

Bases: object

endpoint_id: str: ID of the endpoint to delete.

region: str | None = None: Region to target. If none is passed will use default region from the config.

class scaleway.inference.v1.DeleteModelRequest(model_id: 'str', region: 'Optional[ScwRegion]' = None)

Bases: object

model_id: str: ID of the model to delete.

region: str | None = None: Region to target. If none is passed, the default region from the config is used.

class scaleway.inference.v1.Deployment(id: 'str', name: 'str', project_id: 'str', status: 'DeploymentStatus', tags: 'List[str]', node_type_name: 'str', endpoints: 'List[Endpoint]', size: 'int', min_size: 'int', max_size: 'int', model_id: 'str', model_name: 'str', region: 'ScwRegion', error_message: 'Optional[str]' = None, quantization: 'Optional[DeploymentQuantization]' = None, created_at: 'Optional[datetime]' = None, updated_at: 'Optional[datetime]' = None)

Bases: object

created_at: datetime | None = None: Creation date of the deployment.

endpoints: List[Endpoint]: List of endpoints.

error_message: str | None = None: Displays information if your deployment is in error state.

id: str: Unique identifier.

max_size: int: Defines the maximum size of the pool.

min_size: int: Defines the minimum size of the pool.

model_id: str: ID of the model used for the deployment.

model_name: str: Name of the deployed model.

name: str: Name of the deployment.

node_type_name: str: Node type of the deployment.

project_id: str: Project ID.

quantization: DeploymentQuantization | None = None: Quantization parameters for this deployment.

region: str: Region of the deployment.

size: int: Current size of the pool.

status: DeploymentStatus: Status of the deployment.

tags: List[str]: List of tags applied to the deployment.

updated_at: datetime | None = None: Last modification date of the deployment.

class scaleway.inference.v1.DeploymentQuantization(bits: 'int')

Bases: object

bits: int: The number of bits each model parameter should be quantized to. The quantization method is chosen based on this value.

class scaleway.inference.v1.DeploymentStatus(value: str, names: Any | None = None, *args: Any, **kwargs: Any)

Bases: str, Enum

CREATING = 'creating'

DELETING = 'deleting'

DEPLOYING = 'deploying'

ERROR = 'error'

LOCKED = 'locked'

READY = 'ready'

SCALING = 'scaling'

UNKNOWN_STATUS = 'unknown_status'

class scaleway.inference.v1.Endpoint(id: 'str', url: 'str', disable_auth: 'bool', public_network: 'Optional[EndpointPublicNetworkDetails]' = None, private_network: 'Optional[EndpointPrivateNetworkDetails]' = None)

Bases: object

disable_auth: bool: Defines whether the authentication is disabled.

id: str: Unique identifier.

private_network: EndpointPrivateNetworkDetails | None = None

public_network: EndpointPublicNetworkDetails | None = None

url: str: For private endpoints, the URL will be accessible only from the Private Network.

In addition, private endpoints will expose a CA certificate that can be used to verify the server’s identity. This CA certificate can be retrieved using the GetDeploymentCertificate API call.
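As an illustration of how the `public_network` and `private_network` fields might be used to tell endpoints apart, here is a minimal sketch. The dataclasses are local stand-ins that mirror the fields documented here (they are not imported from the SDK), and `split_urls`, the IDs, and the example URLs are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class EndpointPublicNetworkDetails:
    """Stand-in: the documented class has no fields."""


@dataclass
class EndpointPrivateNetworkDetails:
    private_network_id: str


@dataclass
class Endpoint:
    # Stand-in mirroring the documented Endpoint fields.
    id: str
    url: str
    disable_auth: bool
    public_network: Optional[EndpointPublicNetworkDetails] = None
    private_network: Optional[EndpointPrivateNetworkDetails] = None


def split_urls(endpoints: List[Endpoint]) -> Tuple[List[str], List[str]]:
    """Return (public URLs, private URLs) based on which network field is set."""
    public = [e.url for e in endpoints if e.public_network is not None]
    private = [e.url for e in endpoints if e.private_network is not None]
    return public, private


# Hypothetical data for illustration only.
endpoints = [
    Endpoint(
        id="ep-1",
        url="https://public.example",
        disable_auth=False,
        public_network=EndpointPublicNetworkDetails(),
    ),
    Endpoint(
        id="ep-2",
        url="https://private.example",
        disable_auth=False,
        private_network=EndpointPrivateNetworkDetails(private_network_id="pn-1"),
    ),
]
public_urls, private_urls = split_urls(endpoints)
print(public_urls, private_urls)
```

With real `Endpoint` objects the same check should apply, since both network fields are documented as optional and default to `None`.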

class scaleway.inference.v1.EndpointPrivateNetworkDetails(private_network_id: 'str')

Bases: object

private_network_id: str

class scaleway.inference.v1.EndpointPublicNetworkDetails

Bases: object

class scaleway.inference.v1.EndpointSpec(disable_auth: 'bool', public_network: 'Optional[EndpointPublicNetworkDetails]' = None, private_network: 'Optional[EndpointPrivateNetworkDetails]' = None)

Bases: object

disable_auth: bool: By default, deployments are protected by IAM authentication.

When setting this field to true, the authentication will be disabled.

private_network: EndpointPrivateNetworkDetails | None = None

public_network: EndpointPublicNetworkDetails | None = None

class scaleway.inference.v1.GetDeploymentCertificateRequest(deployment_id: 'str', region: 'Optional[ScwRegion]' = None)

Bases: object

deployment_id: str: ID of the deployment to get the CA certificate for.

region: str | None = None: Region to target. If none is passed, the default region from the config is used.

class scaleway.inference.v1.GetDeploymentRequest(deployment_id: 'str', region: 'Optional[ScwRegion]' = None)

Bases: object

deployment_id: str: ID of the deployment to get.

region: str | None = None: Region to target. If none is passed, the default region from the config is used.

class scaleway.inference.v1.GetModelRequest(model_id: 'str', region: 'Optional[ScwRegion]' = None)

Bases: object

model_id: str: ID of the model to get.

region: str | None = None: Region to target. If none is passed, the default region from the config is used.

class scaleway.inference.v1.InferenceV1API(client: Client, *, bypass_validation: bool = False)

Bases: API

This API allows you to handle your Managed Inference services.

Create a deployment. Create a new inference deployment related to a specific model.

:param model_id: ID of the model to use.
:param node_type_name: Name of the node type to use.
:param endpoints: List of endpoints to create.
:param region: Region to target. If none is passed, the default region from the config is used.
:param name: Name of the deployment.
:param project_id: ID of the Project to create the deployment in.
:param accept_eula: If the model has an EULA, you must accept it before proceeding. The terms of the EULA can be retrieved using the GetModelEula API call.
:param tags: List of tags to apply to the deployment.
:param min_size: Defines the minimum size of the pool.
:param max_size: Defines the maximum size of the pool.
:param quantization: Quantization settings to apply to this deployment.
:return: Deployment

Usage:

result = api.create_deployment(
    model_id="example",
    node_type_name="example",
    endpoints=[],
)

create_endpoint(*, deployment_id: str, endpoint: EndpointSpec, region: str | None = None) → Endpoint

Create an endpoint. Create a new Endpoint related to a specific deployment.

:param deployment_id: ID of the deployment to create the endpoint for.
:param endpoint: Specification of the endpoint.
:param region: Region to target. If none is passed, the default region from the config is used.
:return: Endpoint

Usage:

result = api.create_endpoint(
    deployment_id="example",
    endpoint=EndpointSpec(disable_auth=False),
)

create_model(*, source: ModelSource, region: str | None = None, name: str | None = None, project_id: str | None = None) → Model

Import a model. Import a new model to your model library.

:param source: Where to import the model from.
:param region: Region to target. If none is passed, the default region from the config is used.
:param name: Name of the model.
:param project_id: ID of the Project to import the model in.
:return: Model

Usage:

result = api.create_model(
    source=ModelSource(url="example"),
)

delete_deployment(*, deployment_id: str, region: str | None = None) → Deployment

Delete a deployment. Delete an existing inference deployment.

:param deployment_id: ID of the deployment to delete.
:param region: Region to target. If none is passed, the default region from the config is used.
:return: Deployment

Usage:

result = api.delete_deployment(
    deployment_id="example",
)

delete_endpoint(*, endpoint_id: str, region: str | None = None) → None

Delete an endpoint. Delete an existing Endpoint.

:param endpoint_id: ID of the endpoint to delete.
:param region: Region to target. If none is passed, the default region from the config is used.

Usage:

result = api.delete_endpoint(
    endpoint_id="example",
)

delete_model(*, model_id: str, region: str | None = None) → None

Delete a model. Delete an existing model from your model library.

:param model_id: ID of the model to delete.
:param region: Region to target. If none is passed, the default region from the config is used.

Usage:

result = api.delete_model(
    model_id="example",
)

get_deployment(*, deployment_id: str, region: str | None = None) → Deployment

Get a deployment. Get the deployment for the given ID.

:param deployment_id: ID of the deployment to get.
:param region: Region to target. If none is passed, the default region from the config is used.
:return: Deployment

Usage:

result = api.get_deployment(
    deployment_id="example",
)

get_deployment_certificate(*, deployment_id: str, region: str | None = None) → ScwFile

Get the CA certificate. Get the CA certificate used for the deployment of private endpoints. The CA certificate will be returned as a PEM file.

:param deployment_id: ID of the deployment to get the CA certificate for.
:param region: Region to target. If none is passed, the default region from the config is used.
:return: ScwFile

Usage:

result = api.get_deployment_certificate(
    deployment_id="example",
)

get_model(*, model_id: str, region: str | None = None) → Model

Get a model. Get the model for the given ID.

:param model_id: ID of the model to get.
:param region: Region to target. If none is passed, the default region from the config is used.
:return: Model

Usage:

result = api.get_model(
    model_id="example",
)

List inference deployments. List all your inference deployments, one page at a time.

:param region: Region to target. If none is passed, the default region from the config is used.
:param page: Page number to return.
:param page_size: Maximum number of deployments to return per page.
:param order_by: Order in which to return results.
:param project_id: Filter by Project ID.
:param organization_id: Filter by Organization ID.
:param name: Filter by deployment name.
:param tags: Filter by tags.
:return: ListDeploymentsResponse

Usage:

result = api.list_deployments()

List inference deployments. List all your inference deployments, returned as a single list rather than a paged response.

:param region: Region to target. If none is passed, the default region from the config is used.
:param page: Page number to return.
:param page_size: Maximum number of deployments to return per page.
:param order_by: Order in which to return results.
:param project_id: Filter by Project ID.
:param organization_id: Filter by Organization ID.
:param name: Filter by deployment name.
:param tags: Filter by tags.
:return: List[Deployment]

Usage:

result = api.list_deployments_all()
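The difference between the paged call (one `ListDeploymentsResponse` with `deployments` and `total_count`) and the `_all` variant (a flat `List[Deployment]`) can be sketched as a simple page-walking loop. This is a minimal illustration, not the SDK's implementation: `Page`, `fetch_all`, and `fake_list_deployments` are hypothetical stand-ins, and the assumption that pages are numbered from 1 is mine.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Page:
    # Hypothetical stand-in for ListDeploymentsResponse.
    deployments: List[str]
    total_count: int


def fetch_all(list_page: Callable[..., Page], page_size: int = 2) -> List[str]:
    """Walk pages until every item reported by total_count has been collected."""
    items: List[str] = []
    page = 1  # Assumption: page numbering starts at 1.
    while True:
        response = list_page(page=page, page_size=page_size)
        items.extend(response.deployments)
        # Stop on an empty page or once total_count items have been gathered.
        if not response.deployments or len(items) >= response.total_count:
            return items
        page += 1


def fake_list_deployments(*, page: int, page_size: int) -> Page:
    """Fake paged lister over in-memory data, standing in for a real API call."""
    data = ["a", "b", "c", "d", "e"]
    start = (page - 1) * page_size
    return Page(deployments=data[start:start + page_size], total_count=len(data))


all_items = fetch_all(fake_list_deployments, page_size=2)
print(all_items)
```

In real code, calling `list_deployments_all()` directly avoids writing this loop by hand.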

List models. List all available models, one page at a time.

:param region: Region to target. If none is passed, the default region from the config is used.
:param order_by: Order in which to return results.
:param page: Page number to return.
:param page_size: Maximum number of models to return per page.
:param project_id: Filter by Project ID.
:param name: Filter by model name.
:param tags: Filter by tags.
:return: ListModelsResponse

Usage:

result = api.list_models()

List models. List all available models, returned as a single list rather than a paged response.

:param region: Region to target. If none is passed, the default region from the config is used.
:param order_by: Order in which to return results.
:param page: Page number to return.
:param page_size: Maximum number of models to return per page.
:param project_id: Filter by Project ID.
:param name: Filter by model name.
:param tags: Filter by tags.
:return: List[Model]

Usage:

result = api.list_models_all()

list_node_types(*, include_disabled_types: bool, region: str | None = None, page: int | None = None, page_size: int | None = None) → ListNodeTypesResponse

List available node types. List all available node types. By default, the node types returned in the list are ordered by creation date in ascending order.

:param include_disabled_types: Include disabled node types in the response.
:param region: Region to target. If none is passed, the default region from the config is used.
:param page: Page number to return.
:param page_size: Maximum number of node types to return per page.
:return: ListNodeTypesResponse

Usage:

result = api.list_node_types(
    include_disabled_types=False,
)

list_node_types_all(*, include_disabled_types: bool, region: str | None = None, page: int | None = None, page_size: int | None = None) → List[NodeType]

List available node types. List all available node types. By default, the node types returned in the list are ordered by creation date in ascending order.

:param include_disabled_types: Include disabled node types in the response.
:param region: Region to target. If none is passed, the default region from the config is used.
:param page: Page number to return.
:param page_size: Maximum number of node types to return per page.
:return: List[NodeType]

Usage:

result = api.list_node_types_all(
    include_disabled_types=False,
)

Update a deployment. Update an existing inference deployment.

:param deployment_id: ID of the deployment to update.
:param region: Region to target. If none is passed, the default region from the config is used.
:param name: Name of the deployment.
:param tags: List of tags to apply to the deployment.
:param min_size: Defines the new minimum size of the pool.
:param max_size: Defines the new maximum size of the pool.
:param model_id: ID of the model to set for the deployment.
:param quantization: Quantization to use for the deployment.
:return: Deployment

Usage:

result = api.update_deployment(
    deployment_id="example",
)

update_endpoint(*, endpoint_id: str, region: str | None = None, disable_auth: bool | None = None) → Endpoint

Update an endpoint. Update an existing Endpoint.

:param endpoint_id: ID of the endpoint to update.
:param region: Region to target. If none is passed, the default region from the config is used.
:param disable_auth: By default, deployments are protected by IAM authentication. When setting this field to true, the authentication will be disabled.
:return: Endpoint

Usage:

result = api.update_endpoint(
    endpoint_id="example",
)

wait_for_deployment(*, deployment_id: str, region: str | None = None, options: WaitForOptions[Deployment, bool] | None = None) → Deployment

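Wait for a deployment. Repeatedly get the deployment for the given ID until it leaves a transient status or the wait options stop the polling, then return it.

:param deployment_id: ID of the deployment to get.
:param region: Region to target. If none is passed, the default region from the config is used.
:param options: Optional settings that control the wait.
:return: Deployment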

Usage:

result = api.wait_for_deployment(
    deployment_id="example",
)
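To make the waiting behaviour concrete, here is a minimal polling sketch. It is not the SDK's implementation: the enum is a local stand-in that copies the documented `DeploymentStatus` values, `poll_until_settled` is a hypothetical helper, and the set of statuses treated as "still in progress" is an assumption.

```python
import time
from enum import Enum
from typing import Callable


class DeploymentStatus(str, Enum):
    # Stand-in copying the documented status values.
    CREATING = "creating"
    DELETING = "deleting"
    DEPLOYING = "deploying"
    ERROR = "error"
    LOCKED = "locked"
    READY = "ready"
    SCALING = "scaling"
    UNKNOWN_STATUS = "unknown_status"


# Assumption: statuses considered "in progress"; the SDK may use a different set.
TRANSIENT = {
    DeploymentStatus.CREATING,
    DeploymentStatus.DEPLOYING,
    DeploymentStatus.SCALING,
    DeploymentStatus.DELETING,
}


def poll_until_settled(
    fetch_status: Callable[[], DeploymentStatus],
    *,
    interval: float = 1.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> DeploymentStatus:
    """Call fetch_status until it returns a non-transient status or time runs out."""
    waited = 0.0
    while True:
        status = fetch_status()
        if status not in TRANSIENT:
            return status
        if waited >= timeout:
            raise TimeoutError(f"still {status.value} after {waited:.0f}s")
        sleep(interval)
        waited += interval


# Simulated status sequence; sleep is replaced by a no-op so the demo runs instantly.
statuses = iter(
    [DeploymentStatus.CREATING, DeploymentStatus.DEPLOYING, DeploymentStatus.READY]
)
final_status = poll_until_settled(lambda: next(statuses), sleep=lambda _: None)
print(final_status.value)
```

With the real client, `fetch_status` would wrap a `get_deployment` call and read its `status` field; `wait_for_deployment` exists so that this loop does not have to be written by hand.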

wait_for_model(*, model_id: str, region: str | None = None, options: WaitForOptions[Model, bool] | None = None) → Model

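Wait for a model. Repeatedly get the model for the given ID until it leaves a transient status or the wait options stop the polling, then return it.

:param model_id: ID of the model to get.
:param region: Region to target. If none is passed, the default region from the config is used.
:param options: Optional settings that control the wait.
:return: Model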

Usage:

result = api.wait_for_model(
    model_id="example",
)

class scaleway.inference.v1.ListDeploymentsRequest(region: 'Optional[ScwRegion]' = None, page: 'Optional[int]' = 0, page_size: 'Optional[int]' = 0, order_by: 'Optional[ListDeploymentsRequestOrderBy]' = <ListDeploymentsRequestOrderBy.CREATED_AT_DESC: 'created_at_desc'>, project_id: 'Optional[str]' = None, organization_id: 'Optional[str]' = None, name: 'Optional[str]' = None, tags: 'Optional[List[str]]' = <factory>)

Bases: object

name: str | None = None: Filter by deployment name.

order_by: ListDeploymentsRequestOrderBy | None = 'created_at_desc': Order in which to return results.

organization_id: str | None = None: Filter by Organization ID.

page: int | None = 0: Page number to return.

page_size: int | None = 0: Maximum number of deployments to return per page.

project_id: str | None = None: Filter by Project ID.

region: str | None = None: Region to target. If none is passed, the default region from the config is used.

tags: List[str] | None: Filter by tags.

class scaleway.inference.v1.ListDeploymentsRequestOrderBy(value: str, names: Any | None = None, *args: Any, **kwargs: Any)

Bases: str, Enum

CREATED_AT_ASC = 'created_at_asc'

CREATED_AT_DESC = 'created_at_desc'

NAME_ASC = 'name_asc'

NAME_DESC = 'name_desc'

class scaleway.inference.v1.ListDeploymentsResponse(deployments: 'List[Deployment]', total_count: 'int')

Bases: object

deployments: List[Deployment]: List of deployments on the current page.

total_count: int: Total number of deployments.

class scaleway.inference.v1.ListModelsRequest(region: 'Optional[ScwRegion]' = None, order_by: 'Optional[ListModelsRequestOrderBy]' = <ListModelsRequestOrderBy.DISPLAY_RANK_ASC: 'display_rank_asc'>, page: 'Optional[int]' = 0, page_size: 'Optional[int]' = 0, project_id: 'Optional[str]' = None, name: 'Optional[str]' = None, tags: 'Optional[List[str]]' = <factory>)

Bases: object

name: str | None = None: Filter by model name.

order_by: ListModelsRequestOrderBy | None = 'display_rank_asc': Order in which to return results.

page: int | None = 0: Page number to return.

page_size: int | None = 0: Maximum number of models to return per page.

project_id: str | None = None: Filter by Project ID.

region: str | None = None: Region to target. If none is passed, the default region from the config is used.

tags: List[str] | None: Filter by tags.

class scaleway.inference.v1.ListModelsRequestOrderBy(value: str, names: Any | None = None, *args: Any, **kwargs: Any)

Bases: str, Enum

CREATED_AT_ASC = 'created_at_asc'

CREATED_AT_DESC = 'created_at_desc'

DISPLAY_RANK_ASC = 'display_rank_asc'

NAME_ASC = 'name_asc'

NAME_DESC = 'name_desc'

class scaleway.inference.v1.ListModelsResponse(models: 'List[Model]', total_count: 'int')

Bases: object

models: List[Model]: List of models on the current page.

total_count: int: Total number of models.

class scaleway.inference.v1.ListNodeTypesRequest(include_disabled_types: 'bool', region: 'Optional[ScwRegion]' = None, page: 'Optional[int]' = 0, page_size: 'Optional[int]' = 0)

Bases: object

include_disabled_types: bool: Include disabled node types in the response.

page: int | None = 0: Page number to return.

page_size: int | None = 0: Maximum number of node types to return per page.

region: str | None = None: Region to target. If none is passed, the default region from the config is used.

class scaleway.inference.v1.ListNodeTypesResponse(node_types: 'List[NodeType]', total_count: 'int')

Bases: object

node_types: List[NodeType]: List of node types.

total_count: int: Total number of node types.

class scaleway.inference.v1.Model(id: 'str', name: 'str', project_id: 'str', tags: 'List[str]', status: 'ModelStatus', description: 'str', has_eula: 'bool', region: 'ScwRegion', nodes_support: 'List[ModelSupportInfo]', parameter_size_bits: 'int', size_bytes: 'int', error_message: 'Optional[str]' = None, created_at: 'Optional[datetime]' = None, updated_at: 'Optional[datetime]' = None)

Bases: object

created_at: datetime | None = None: Creation date of the model.

description: str: Purpose of the model.

error_message: str | None = None: Displays information if your model is in error state.

has_eula: bool: Defines whether the model has an end user license agreement.

id: str: Unique identifier.

name: str: Unique Name identifier.

nodes_support: List[ModelSupportInfo]: Supported nodes types with quantization options and context lengths.

parameter_size_bits: int: Size, in bits, of the model parameters.

project_id: str: Project ID.

region: str: Region of the model.

size_bytes: int: Total size, in bytes, of the model files.

status: ModelStatus: Status of the model.

tags: List[str]: List of tags applied to the model.

updated_at: datetime | None = None: Last modification date of the model.

class scaleway.inference.v1.ModelSource(url: 'str', secret: 'Optional[str]' = None)

Bases: object

secret: str | None = None

url: str

class scaleway.inference.v1.ModelStatus(value: str, names: Any | None = None, *args: Any, **kwargs: Any)

Bases: str, Enum

DOWNLOADING = 'downloading'

ERROR = 'error'

PREPARING = 'preparing'

READY = 'ready'

UNKNOWN_STATUS = 'unknown_status'

class scaleway.inference.v1.ModelSupportInfo(nodes: 'List[ModelSupportedNode]')

Bases: object

nodes: List[ModelSupportedNode]: List of supported node types.

class scaleway.inference.v1.ModelSupportedNode(node_type_name: 'str', quantizations: 'List[ModelSupportedQuantization]')

Bases: object

node_type_name: str: Supported node type.

quantizations: List[ModelSupportedQuantization]: Supported quantizations.

class scaleway.inference.v1.ModelSupportedQuantization(quantization_bits: 'int', allowed: 'bool', max_context_size: 'int')

Bases: object

allowed: bool: Tells whether this quantization is allowed for this node type.

max_context_size: int: Maximum inference context size available for this node type and quantization.

quantization_bits: int: Number of bits for this supported quantization.
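The nesting of `ModelSupportInfo`, `ModelSupportedNode`, and `ModelSupportedQuantization` can be easier to follow with a small traversal. The sketch below uses local stand-in dataclasses that mirror the documented fields (nothing is imported from the SDK); `best_quantization`, the node type names, and the sample numbers are hypothetical, and choosing the highest allowed bit width is just one possible policy.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ModelSupportedQuantization:
    quantization_bits: int
    allowed: bool
    max_context_size: int


@dataclass
class ModelSupportedNode:
    node_type_name: str
    quantizations: List[ModelSupportedQuantization]


@dataclass
class ModelSupportInfo:
    nodes: List[ModelSupportedNode]


def best_quantization(
    nodes_support: List[ModelSupportInfo], node_type_name: str
) -> Optional[ModelSupportedQuantization]:
    """Return the allowed quantization with the most bits for a node type, if any."""
    candidates = [
        quantization
        for info in nodes_support
        for node in info.nodes
        if node.node_type_name == node_type_name
        for quantization in node.quantizations
        if quantization.allowed
    ]
    return max(candidates, key=lambda q: q.quantization_bits, default=None)


# Hypothetical sample data shaped like Model.nodes_support.
support = [
    ModelSupportInfo(
        nodes=[
            ModelSupportedNode(
                node_type_name="node-a",
                quantizations=[
                    ModelSupportedQuantization(16, False, 0),
                    ModelSupportedQuantization(8, True, 8192),
                    ModelSupportedQuantization(4, True, 16384),
                ],
            )
        ]
    )
]
choice = best_quantization(support, "node-a")
print(choice)
```

The selected `quantization_bits` value is what would be passed as `bits` in a `DeploymentQuantization`.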

class scaleway.inference.v1.NodeType(name: 'str', stock_status: 'NodeTypeStock', description: 'str', vcpus: 'int', memory: 'int', vram: 'int', disabled: 'bool', beta: 'bool', gpus: 'int', region: 'ScwRegion', created_at: 'Optional[datetime]' = None, updated_at: 'Optional[datetime]' = None)

Bases: object

beta: bool: The node type is currently in beta.

created_at: datetime | None = None: Creation date of the node type.

description: str: Current specs of the offer.

disabled: bool: The node type is currently disabled.

gpus: int: Number of GPUs.

memory: int: Quantity of RAM.

name: str: Name of the node type.

region: str: Region of the node type.

stock_status: NodeTypeStock: Current stock status for the node type.

updated_at: datetime | None = None: Last modification date of the node type.

vcpus: int: Number of virtual CPUs.

vram: int: Quantity of GPU RAM.

class scaleway.inference.v1.NodeTypeStock(value: str, names: Any | None = None, *args: Any, **kwargs: Any)

Bases: str, Enum

AVAILABLE = 'available'

LOW_STOCK = 'low_stock'

OUT_OF_STOCK = 'out_of_stock'

UNKNOWN_STOCK = 'unknown_stock'
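A typical use of `stock_status` together with `disabled` is to narrow a node type listing to the types that can actually be deployed. The sketch below is an assumption-laden illustration: the enum and dataclass are local stand-ins with a subset of the documented fields, `deployable_names` is a hypothetical helper, and treating `LOW_STOCK` as still deployable is my own choice.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class NodeTypeStock(str, Enum):
    # Stand-in copying the documented stock values.
    AVAILABLE = "available"
    LOW_STOCK = "low_stock"
    OUT_OF_STOCK = "out_of_stock"
    UNKNOWN_STOCK = "unknown_stock"


@dataclass
class NodeType:
    # Stand-in with only the fields this example needs.
    name: str
    stock_status: NodeTypeStock
    disabled: bool
    gpus: int


# Assumption: low stock still counts as deployable.
IN_STOCK = {NodeTypeStock.AVAILABLE, NodeTypeStock.LOW_STOCK}


def deployable_names(node_types: List[NodeType]) -> List[str]:
    """Names of node types that are enabled and currently in stock."""
    return [
        node.name
        for node in node_types
        if not node.disabled and node.stock_status in IN_STOCK
    ]


# Hypothetical node types for illustration.
node_types = [
    NodeType("node-a", NodeTypeStock.AVAILABLE, disabled=False, gpus=1),
    NodeType("node-b", NodeTypeStock.OUT_OF_STOCK, disabled=False, gpus=2),
    NodeType("node-c", NodeTypeStock.LOW_STOCK, disabled=False, gpus=4),
    NodeType("node-d", NodeTypeStock.AVAILABLE, disabled=True, gpus=8),
]
names = deployable_names(node_types)
print(names)
```

With the real client, the input list would come from `list_node_types_all(include_disabled_types=False)`, in which case the `disabled` check is only a safeguard.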

class scaleway.inference.v1.UpdateDeploymentRequest(deployment_id: 'str', region: 'Optional[ScwRegion]' = None, name: 'Optional[str]' = None, tags: 'Optional[List[str]]' = <factory>, min_size: 'Optional[int]' = 0, max_size: 'Optional[int]' = 0, model_id: 'Optional[str]' = None, quantization: 'Optional[DeploymentQuantization]' = None)

Bases: object

deployment_id: str: ID of the deployment to update.

max_size: int | None = 0: Defines the new maximum size of the pool.

min_size: int | None = 0: Defines the new minimum size of the pool.

model_id: str | None = None: Id of the model to set to the deployment.

name: str | None = None: Name of the deployment.

quantization: DeploymentQuantization | None = None: Quantization to use to the deployment.

region: str | None = None: Region to target. If none is passed, the default region from the config is used.

tags: List[str] | None: List of tags to apply to the deployment.

class scaleway.inference.v1.UpdateEndpointRequest(endpoint_id: 'str', region: 'Optional[ScwRegion]' = None, disable_auth: 'Optional[bool]' = False)

Bases: object

disable_auth: bool | None = False: By default, deployments are protected by IAM authentication.

When setting this field to true, the authentication will be disabled.

endpoint_id: str: ID of the endpoint to update.

region: str | None = None: Region to target. If none is passed, the default region from the config is used.