2021年4月5日星期一

R DBI SQL Server: dbWriteTable truncates rows / field.types parameter does not work

I am using a SQL Server database and JDBC and pool with the sqljdbc42.jar driver.

Code:

library(DBI)  library(RJDBC)  library(pool)    jar.path.ms.sql.driver <- "./www/base/sql_drivers/sqljdbc42.jar"    jdbc.sql.driver <- JDBC(    driverClass = "com.microsoft.sqlserver.jdbc.SQLServerDriver",    classPath   = jar.path.ms.sql.driver  )  sql.url <- "jdbc:sqlserver://sql01" # our server url  sql.dname <-  "my_projects"  sql.username <- "user"  sql.password <- "password"    pool <- dbPool(    drv      = jdbc.sql.driver,    url      = sql.url,    dname    = sql.dname,    username = sql.username,    password = sql.password  )  

So far, so good. I can read tables from the database using dbReadTable. I can also write tables to the database using dbWriteTable. For the write case, I have columns in the data frame to be written with very long strings. In some cases the strings are longer than 255 characters.

I had hoped to use the field.types parameter of the dbWriteTable function to configure the SQL Server columns correctly (comments is my data frame to write to the database):

conn <- poolCheckout(pool)    DBI::dbWriteTable(conn = conn,     DBI::Id(cluster = "my_projects", schema = "dbo", table = "comments"),     value = comments,     overwrite = TRUE,    row.names = FALSE,    field.types = c(                 STRATEGIC_AREA = "varchar(255)",                 OBJECTIVE = "varchar(255)",                 METHOD_OF_MEASURE = "varchar(MAX)",                 TARGET = "float",                 UNIT = "varchar(255)",                 MIN_MAX = "varchar(255)",                 JUL = "varchar(1024)",                 AUG = "varchar(MAX)",                 SEP = "varchar(MAX)",                 OCT = "varchar(MAX)",                 NOV = "varchar(MAX)",                 DEC = "varchar(MAX)",                 JAN = "varchar(MAX)",                 FEB = "varchar(MAX)",                 MAR = "varchar(MAX)",                 APR = "varchar(MAX)",                 MAY = "varchar(MAX)",                 JUN = "varchar(MAX)"    ))  poolReturn(conn)  poolClose(pool)  

The problem is that the SQL Server database ignores the column classes and is created consistently with varchar(255) columns. Column classes such as varchar(1024) or varchar(MAX) in field.types are ignored. The function appears to follow a mapping as per the dataframe column classes:

  • char -> varchar(255)
  • num -> float

And it appears to make no difference if I replace DBI::Id(cluster = "my_projects", schema = "dbo", table = "comments") with "my_projects.dbo.comments". It also does not matter if I quote the left hand side of the named character vector I pass to field.types.

As a result, the data frame rows that contain strings longer than 255 characters are truncated (those lines are skipped) and I end up with fewer rows in the SQL Server table compared to the data frame. Error message (the very long string sits in the FEB column):

Error in .local(conn, statement, ...) :
execute JDBC update query failed in dbSendUpdate (String or binary data would be truncated in table 'my_projects.dbo.comments', column 'FEB'. Truncated value:

What am I missing?

https://stackoverflow.com/questions/66864660/r-dbi-sql-server-dbwritetable-truncates-rows-field-types-parameter-does-not-w March 30, 2021 at 12:07PM

没有评论:

发表评论