[Feature] Add databricks_function resource #4189

Open: wants to merge 14 commits into base `main`
142 changes: 142 additions & 0 deletions docs/resources/function.md

@@ -0,0 +1,142 @@
---
subcategory: "Unity Catalog"
---
# databricks_function Resource

-> This resource can only be used with a workspace-level provider.

Creates a [User-Defined Function (UDF)](https://docs.databricks.com/en/udf/unity-catalog.html) in Unity Catalog. UDFs can be defined in SQL or in external languages (e.g., Python), and are stored within [Unity Catalog schemas](../resources/schema.md).

## Example Usage

### SQL-based function:

```hcl
resource "databricks_catalog" "sandbox" {
  name    = "sandbox_example"
  comment = "Catalog managed by Terraform"
}

resource "databricks_schema" "functions" {
  catalog_name = databricks_catalog.sandbox.name
  name         = "functions_example"
  comment      = "Schema managed by Terraform"
}

resource "databricks_function" "calculate_bmi" {
  name         = "calculate_bmi"
  catalog_name = databricks_catalog.sandbox.name
  schema_name  = databricks_schema.functions.name
  input_params = {
    parameters = [
      {
        name      = "weight"
        type_name = "DOUBLE"
      },
      {
        name      = "height"
        type_name = "DOUBLE"
      }
    ]
  }
  data_type          = "DOUBLE"
  routine_body       = "SQL"
  routine_definition = "weight / (height * height)"
  is_deterministic   = true
  sql_data_access    = "CONTAINS_SQL"
  security_type      = "DEFINER"
}
```

### Python-based function:

```hcl
resource "databricks_function" "calculate_bmi_py" {
  name         = "calculate_bmi_py"
  catalog_name = databricks_catalog.sandbox.name
  schema_name  = databricks_schema.functions.name
  input_params = {
    parameters = [
      {
        name      = "weight_kg"
        type_name = "DOUBLE"
      },
      {
        name      = "height_m"
        type_name = "DOUBLE"
      }
    ]
  }
  data_type          = "DOUBLE"
  routine_body       = "EXTERNAL"
  routine_definition = "return weight_kg / (height_m ** 2)"
  language           = "Python"
  is_deterministic   = false
  sql_data_access    = "NO_SQL"
  security_type      = "DEFINER"
}
```

## Argument Reference

The following arguments are supported:

* `name` - (Required) The name of the function.
* `catalog_name` - (Required) The name of the parent [databricks_catalog](../resources/catalog.md).
* `schema_name` - (Required) The name of the [databricks_schema](../resources/schema.md) where the function will reside.
* `input_params` - (Required) An object describing the function's input parameters. Consists of:
  * `parameters` - (Required) A list of objects, one per input parameter. Each object includes:
    * `name` - (Required) The name of the parameter.
    * `type_name` - (Required) The data type of the parameter (e.g., `DOUBLE`, `INT`, etc.).
* `data_type` - (Required) The return data type of the function (e.g., `DOUBLE`).
* `full_data_type` - (Required) Pretty printed function data type (e.g., `string`).
* `return_params` - (Optional) An object describing the function's return parameters. Consists of:
  * `parameters` - (Required) A list of objects, one per return parameter. Each object includes:
    * `name` - (Required) The name of the return parameter.
    * `type_text` - (Required) The full data type specification as SQL/catalog string text.
    * `type_json` - The full data type specification as JSON-serialized text.
    * `type_name` - (Required) The name of the data type (e.g., `BOOLEAN`, `INT`, `STRING`, etc.).
    * `type_precision` - (Required for `DecimalTypes`) Digits of precision for the type.
    * `type_scale` - (Required for `DecimalTypes`) Digits to the right of the decimal for the type.
    * `type_interval_type` - The format of `IntervalType`.
    * `position` - (Required) The ordinal position of the parameter (starting at 0).
    * `parameter_mode` - The mode of the parameter. Possible value: `IN`.
    * `parameter_type` - The type of the parameter. Possible values:
      * `PARAM` - Represents a generic parameter.
      * `COLUMN` - Represents a column parameter.
    * `parameter_default` - The default value for the parameter, if any.
    * `comment` - User-provided free-form text description of the parameter.
* `routine_definition` - (Required) The actual definition of the function, expressed in SQL or the specified external language.
* `routine_dependencies` - (Optional) An object describing the function's dependencies. Consists of:
  * `dependencies` - (Optional) A list of objects describing the dependencies. Each object includes:
    * `table` - (Optional) An object representing a table that is dependent on the SQL object.
    * `function` - (Optional) An object representing a function that is dependent on the SQL object.
* `is_deterministic` - (Required, `bool`) Whether the function is deterministic.
* `is_null_call` - (Required, `bool`) Whether the function should handle `NULL` input arguments explicitly.
* `specific_name` - (Required) Specific name of the function. Reserved for future use.
* `external_name` - (Optional) External function name.
* `sql_path` - (Optional) The fully qualified SQL path where the function resides, including catalog and schema information.
* `comment` - (Optional) User-provided free-form text description.
* `properties` - (Optional) A map of key-value pairs with additional metadata or attributes associated with the function.
* `routine_body` - (Required) The body type of the function: `SQL` for SQL-based functions or `EXTERNAL` for functions in external languages.
* `security_type` - (Required) The security type of the function, generally `DEFINER`.
* `sql_data_access` - (Required) The SQL data access level for the function. Possible values are:
  * `CONTAINS_SQL` - The function contains SQL statements.
  * `READS_SQL_DATA` - The function reads SQL data but does not modify it.
  * `NO_SQL` - The function does not contain SQL.
* `parameter_style` - (Required) Function parameter style (e.g., `S` for SQL).

## Attribute Reference

In addition to all arguments above, the following attributes are exported:

* `full_name` - Full name of the function in the form of `catalog_name.schema_name.function_name`.
* `created_at` - The time when this function was created, in epoch milliseconds.
* `created_by` - The username of the function's creator.
* `updated_at` - The time when this function was last updated, in epoch milliseconds.
* `updated_by` - The username of the last user to modify the function.

## Related Resources

The following resources are used in the same context:

* [databricks_schema](./schema.md) to get information about a single schema.
* Data source [databricks_functions](../data-sources/functions.md) to get a list of functions under a specified location.
2 changes: 2 additions & 0 deletions internal/providers/pluginfw/pluginfw_rollout_utils.go

@@ -42,7 +42,9 @@ var migratedDataSources = []func() datasource.DataSource{
// List of resources that have been onboarded to the plugin framework - not migrated from sdkv2.
// Keep this list sorted.
var pluginFwOnlyResources = []func() resource.Resource{
	// TODO Add resources here
	app.ResourceApp,
	catalog.ResourceFunction,
	sharing.ResourceShare,
}

202 changes: 202 additions & 0 deletions internal/providers/pluginfw/products/catalog/resource_function.go

@@ -0,0 +1,202 @@
package catalog

import (
	"context"

	"github.com/databricks/databricks-sdk-go/apierr"
	"github.com/databricks/databricks-sdk-go/service/catalog"
	"github.com/databricks/terraform-provider-databricks/common"
	pluginfwcommon "github.com/databricks/terraform-provider-databricks/internal/providers/pluginfw/common"
	pluginfwcontext "github.com/databricks/terraform-provider-databricks/internal/providers/pluginfw/context"
	"github.com/databricks/terraform-provider-databricks/internal/providers/pluginfw/converters"
	"github.com/databricks/terraform-provider-databricks/internal/providers/pluginfw/tfschema"
	"github.com/databricks/terraform-provider-databricks/internal/service/catalog_tf"
	"github.com/hashicorp/terraform-plugin-framework/path"
	"github.com/hashicorp/terraform-plugin-framework/resource"
	"github.com/hashicorp/terraform-plugin-framework/resource/schema"
)

const resourceName = "function"

var _ resource.ResourceWithConfigure = &FunctionResource{}

func ResourceFunction() resource.Resource {
	return &FunctionResource{}
}

type FunctionResource struct {
	Client *common.DatabricksClient
}

func (r *FunctionResource) Metadata(ctx context.Context, req resource.MetadataRequest, resp *resource.MetadataResponse) {
	resp.TypeName = pluginfwcommon.GetDatabricksProductionName(resourceName)
}

func (r *FunctionResource) Schema(ctx context.Context, req resource.SchemaRequest, resp *resource.SchemaResponse) {
	attrs, blocks := tfschema.ResourceStructToSchemaMap(ctx, catalog_tf.FunctionInfo{}, func(c tfschema.CustomizableSchema) tfschema.CustomizableSchema {
		c.SetRequired("name")
		c.SetRequired("catalog_name")
		c.SetRequired("schema_name")
		c.SetRequired("input_params")
		c.SetRequired("data_type")
		c.SetRequired("full_data_type")
		c.SetRequired("routine_definition")
		c.SetRequired("is_deterministic")
		c.SetRequired("is_null_call")
		c.SetRequired("specific_name")
		c.SetRequired("routine_body")
		c.SetRequired("security_type")
		c.SetRequired("sql_data_access")
		c.SetRequired("parameter_style")
**Review thread on lines +37 to +50:**

**Contributor:** Are these actually needed? As in, if you don't have them, are they treated as optional?

**dgomez04 (author), Dec 23, 2024:** I based myself on the REST API reference; did I interpret it correctly?

**Contributor:** If they are in the REST API reference, they are in our API specification. It's probable that `ResourceStructToSchema()` generates a schema with these already set. You can verify by seeing that we don't specify these fields as optional in the TFSDK structs:

```go
// Function parameter style. **S** is the value for SQL.
ParameterStyle types.String `tfsdk:"parameter_style" tf:""`
```

So yes, anything that is marked properly in the OpenAPI spec can be removed from here.
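For contrast, a minimal sketch of the tag convention the reviewer is pointing at, assuming (as elsewhere in this provider's generated TFSDK structs) that an empty `tf:""` tag means required while `tf:"optional"` marks an optional attribute; the `Comment` field here is illustrative:

```go
// Sketch only, not part of this PR: how ResourceStructToSchema is assumed
// to interpret tf struct tags when generating the schema.
type exampleFunctionInfo struct {
	// Empty tf tag: treated as required.
	ParameterStyle types.String `tfsdk:"parameter_style" tf:""`
	// Explicitly optional.
	Comment types.String `tfsdk:"comment" tf:"optional"`
}
```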


c.SetReadOnly("full_name")
c.SetReadOnly("created_at")
c.SetReadOnly("created_by")
c.SetReadOnly("updated_at")
c.SetReadOnly("updated_by")

return c
**Follow-up review thread:**

**Contributor:** I looked at UpdateFunction, and it seems like the only thing you can update about a function is its owner. Everything else cannot be updated. Can we mark all other fields with SetForceNew()?

**dgomez04 (author):** SetForceNew() doesn't seem to be a method of tfschema.CustomizableSchema. Am I looking at the wrong place? Any guidance here is appreciated.

**Contributor:** My mistake, that is how you did this with SDKv2. This is now done via plan modifiers. For example:

```go
import "github.com/hashicorp/terraform-plugin-framework/resource/schema/stringplanmodifier"
...

cs.AddPlanModifier(stringplanmodifier.RequiresReplace(), "name")
```

You can interpret this as: when computing the plan, if there is a change to the "name" field, the entire resource must be replaced.

Do this for each field that cannot be updated.
	})

	resp.Schema = schema.Schema{
		Description: "Terraform schema for Databricks Function",
		Attributes:  attrs,
		Blocks:      blocks,
	}
}

func (r *FunctionResource) Configure(ctx context.Context, req resource.ConfigureRequest, resp *resource.ConfigureResponse) {
	if r.Client == nil && req.ProviderData != nil {
		r.Client = pluginfwcommon.ConfigureResource(req, resp)
	}
}

func (r *FunctionResource) ImportState(ctx context.Context, req resource.ImportStateRequest, resp *resource.ImportStateResponse) {
	// Import uses the function's full name (catalog.schema.function) as the ID.
	resource.ImportStatePassthroughID(ctx, path.Root("full_name"), req, resp)
}

func (r *FunctionResource) Create(ctx context.Context, req resource.CreateRequest, resp *resource.CreateResponse) {
	ctx = pluginfwcontext.SetUserAgentInResourceContext(ctx, resourceName)
	w, diags := r.Client.GetWorkspaceClient()
	resp.Diagnostics.Append(diags...)
	if resp.Diagnostics.HasError() {
		return
	}

	var planFunc catalog_tf.FunctionInfo
	resp.Diagnostics.Append(req.Plan.Get(ctx, &planFunc)...)
	if resp.Diagnostics.HasError() {
		return
	}

	// Convert the Terraform plan into a Go SDK create request.
	var createReq catalog.CreateFunctionRequest
	resp.Diagnostics.Append(converters.TfSdkToGoSdkStruct(ctx, planFunc, &createReq)...)
	if resp.Diagnostics.HasError() {
		return
	}

	funcInfo, err := w.Functions.Create(ctx, createReq)
	if err != nil {
		resp.Diagnostics.AddError("failed to create function", err.Error())
		return
	}

	// Reflect server-populated fields back into the planned state.
	resp.Diagnostics.Append(converters.GoSdkToTfSdkStruct(ctx, funcInfo, &planFunc)...)
	if resp.Diagnostics.HasError() {
		return
	}

	resp.Diagnostics.Append(resp.State.Set(ctx, planFunc)...)
}

func (r *FunctionResource) Update(ctx context.Context, req resource.UpdateRequest, resp *resource.UpdateResponse) {
	ctx = pluginfwcontext.SetUserAgentInResourceContext(ctx, resourceName)
	w, diags := r.Client.GetWorkspaceClient()
	resp.Diagnostics.Append(diags...)
	if resp.Diagnostics.HasError() {
		return
	}

	var planFunc catalog_tf.FunctionInfo
	resp.Diagnostics.Append(req.Plan.Get(ctx, &planFunc)...)
	if resp.Diagnostics.HasError() {
		return
	}

	var updateReq catalog.UpdateFunction
	resp.Diagnostics.Append(converters.TfSdkToGoSdkStruct(ctx, planFunc, &updateReq)...)
	if resp.Diagnostics.HasError() {
		return
	}

	funcInfo, err := w.Functions.Update(ctx, updateReq)
	if err != nil {
		resp.Diagnostics.AddError("failed to update function", err.Error())
		return
	}

	resp.Diagnostics.Append(converters.GoSdkToTfSdkStruct(ctx, funcInfo, &planFunc)...)
	if resp.Diagnostics.HasError() {
		return
	}

	resp.Diagnostics.Append(resp.State.Set(ctx, planFunc)...)
}

func (r *FunctionResource) Read(ctx context.Context, req resource.ReadRequest, resp *resource.ReadResponse) {
	ctx = pluginfwcontext.SetUserAgentInResourceContext(ctx, resourceName)

	w, diags := r.Client.GetWorkspaceClient()
	resp.Diagnostics.Append(diags...)
	if resp.Diagnostics.HasError() {
		return
	}

	var stateFunc catalog_tf.FunctionInfo
	resp.Diagnostics.Append(req.State.Get(ctx, &stateFunc)...)
	if resp.Diagnostics.HasError() {
		return
	}

	funcName := stateFunc.FullName.ValueString()

	funcInfo, err := w.Functions.GetByName(ctx, funcName)
	if err != nil {
		// If the function no longer exists, drop it from state so it can be recreated.
		if apierr.IsMissing(err) {
			resp.State.RemoveResource(ctx)
			return
		}
		resp.Diagnostics.AddError("failed to get function", err.Error())
		return
	}

	resp.Diagnostics.Append(converters.GoSdkToTfSdkStruct(ctx, funcInfo, &stateFunc)...)
	if resp.Diagnostics.HasError() {
		return
	}

	resp.Diagnostics.Append(resp.State.Set(ctx, stateFunc)...)
}

func (r *FunctionResource) Delete(ctx context.Context, req resource.DeleteRequest, resp *resource.DeleteResponse) {
	ctx = pluginfwcontext.SetUserAgentInResourceContext(ctx, resourceName)
	w, diags := r.Client.GetWorkspaceClient()
	resp.Diagnostics.Append(diags...)
	if resp.Diagnostics.HasError() {
		return
	}

	var deleteReq catalog_tf.DeleteFunctionRequest
	resp.Diagnostics.Append(req.State.GetAttribute(ctx, path.Root("full_name"), &deleteReq.Name)...)
	if resp.Diagnostics.HasError() {
		return
	}

	err := w.Functions.DeleteByName(ctx, deleteReq.Name.ValueString())
	if err != nil && !apierr.IsMissing(err) {
		resp.Diagnostics.AddError("failed to delete function", err.Error())
	}
}