Merged
33 commits
915d3ff
Added ability to specify childhoodLocation
antaresc Jan 30, 2019
3e19929
Never change python
antaresc Jan 30, 2019
f4eab25
Merged google master
antaresc Jun 21, 2019
486200e
Adding dockerfiles
antaresc Jun 25, 2019
836e833
Updated workspace
antaresc Jun 25, 2019
3c4f87a
Updated workspace
antaresc Jun 25, 2019
011b073
Fixed test
antaresc Jun 25, 2019
c6e2596
Added verbose output
antaresc Jun 25, 2019
5ac3cdd
Python3
antaresc Jun 25, 2019
891a648
Merge pull request #3 from ACscooter/master-cloud-build
antaresc Jun 25, 2019
504f338
Merged
antaresc Jul 8, 2019
6a0f392
Incremented version number
antaresc Jul 8, 2019
0095540
merged
antaresc Jul 8, 2019
fb2b0e8
feature/api-version-2 (#47)
antaresc Jun 21, 2019
d5c37ec
Make SchoolDistrict containedIn County
Jun 24, 2019
e64cfaa
Feature/api version 2 (#50)
antaresc Jul 10, 2019
c1e6fca
Implemented DCNode constructor and get_property_values (#51)
antaresc Jul 23, 2019
48de76d
Implemented DCQuery, DCNode, DCFrame (#53)
antaresc Jul 25, 2019
8267913
Rename DCFrame, DCNode, DCQuery to Frame, Node, Query, and fix releva…
Spaceenter Jul 25, 2019
09d6742
Remove duplicate definitions of constants.
Spaceenter Jul 26, 2019
72f7147
Remove methods that are not likely to be used often.
Spaceenter Jul 26, 2019
3ac9f76
Implemented API revisions (#57)
antaresc Aug 1, 2019
4efe1aa
add absolute path import for py2 compatibility, other imports for goo…
tjann Aug 1, 2019
2e385f4
Updated docstrings to be more clear (#58)
antaresc Aug 2, 2019
ff78073
Fixed a bug in populations example (#59)
antaresc Aug 2, 2019
f9dd583
added flexibility to use single col dataframe, convert to series for …
tjann Aug 2, 2019
962f5fd
Added unit tests (#61)
antaresc Aug 2, 2019
5782ae8
update API endpoint
tjann Aug 5, 2019
c58f1b7
Added sphinx documentation (#62)
antaresc Aug 6, 2019
ec88afb
Updated client API to reflect new return format of GetPropertyValues …
antaresc Aug 6, 2019
1321851
Added more documentation (#64)
antaresc Aug 6, 2019
c84c0a1
Check in R API Client code (#69)
tjann Aug 7, 2019
0335c7e
Removed out as key from GetPropertyValues (#70)
antaresc Aug 7, 2019
124 changes: 124 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,124 @@
__pycache__/
.dat

### Python ###
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
pip-wheel-metadata/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/
.pytest_cache/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
.python-version

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

### Ignore MAC OS System files ###
# General
.DS_Store
.AppleDouble
.LSOverride
.profraw

# Icon must end with two \r
Icon

# Thumbnails
._*

# Files that might appear in the root of a volume
.DocumentRevisions-V100
.fseventsd
.Spotlight-V100
.TemporaryItems
.Trashes
.VolumeIcon.icns
.com.apple.timemachine.donotpresent

# Directories potentially created on remote AFP share
.AppleDB
.AppleDesktop
Network Trash Folder
Temporary Items
.apdisk

### Ignore BAZEL BUILD System files ###
/bazel-*

### R and RStudio ###
.Rproj.user
.Rhistory
.RData
.Ruserdata
datacommons.RCheck
*tar.gz
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -0,0 +1 @@
# Changelog
1 change: 1 addition & 0 deletions Dockerfile
@@ -16,6 +16,7 @@ RUN apt-get -q update && \

# Install python
RUN python setup.py -q install
RUN pip3 install --upgrade requests

# Run the tests
RUN ./build.sh
2 changes: 2 additions & 0 deletions WORKSPACE
@@ -2,6 +2,8 @@ workspace(name="datacommons")

load("@bazel_tools//tools/build_defs/repo:git.bzl", "git_repository")

load("@bazel_tools//tools/build_defs/repo:git.bzl", "git_repository")

# The following rules are needed to perform pip-install of dependencies.
# Reference: https://github.com/bazelbuild/rules_python
git_repository(
2 changes: 2 additions & 0 deletions api-R/.Rbuildignore
@@ -0,0 +1,2 @@
^.*\.Rproj$
^\.Rproj\.user$
26 changes: 26 additions & 0 deletions api-R/DESCRIPTION
@@ -0,0 +1,26 @@
Package: datacommons
Title: Data Commons REST API R Client
Version: 2.0
Authors@R: person("Tiffany", "Jann", email = "[email protected]", role = c("aut", "cre"))
Description: A RESTful R API Client for querying the DataCommons.org
    Open Knowledge Graph. The Node API endpoint is wrapped using Reticulate.
    The Query API endpoint is implemented in pure R.
    An internet connection is needed for all functions in this package.
    You can find our code here:
    https://github.com/google/datacommons/
Depends: R (>= 3.6.0),
    tidyverse,
    httr,
    jsonlite,
    reticulate
License: Apache License, Version 2.0, MIT+
Encoding: UTF-8
LazyData: true
Imports:
    tidyverse,
    httr,
    jsonlite,
    reticulate
RoxygenNote: 6.1.1
Suggests:
    testthat
9 changes: 9 additions & 0 deletions api-R/NAMESPACE
@@ -0,0 +1,9 @@
# Generated by roxygen2: do not edit by hand

export(GetObservations)
export(GetPlacesIn)
export(GetPopulations)
export(GetPropertyLabels)
export(GetPropertyValues)
export(GetTriples)
export(Query)
112 changes: 112 additions & 0 deletions api-R/R/node-core.R
@@ -0,0 +1,112 @@
# Data Commons Node API - Core Functions
#
# These functions provide R access to core
# Data Commons Node API functions:
# get_property_labels, get_property_values, get_triples.
# www.DataCommons.org


#' Return property labels of specified nodes
#'
#' Returns a map between nodes and outgoing (default) or incoming
#' property labels.
#'
#' @param dcids required, vector of string(s) that identify node(s) to get
#' property labels for.
#' @param outgoing optional, boolean indicating whether to get properties
#' originating from the given node. TRUE by default.
#' @return Named list of properties associated with the given dcid(s) via the
#' given direction.
#' @export
#' @examples
#' # dcid string of Santa Clara County.
#' sccDcid <- 'geoId/06085'
#' # Get incoming and outgoing properties for Santa Clara County.
#' inLabels <- GetPropertyLabels(sccDcid, outgoing = FALSE)
#' outLabels <- GetPropertyLabels(sccDcid)
#'
#' # List of dcid strings of Florida, Planned Parenthood West, and the
#' # Republican Party.
#' dcids <- c('geoId/12', 'plannedParenthood-PlannedParenthoodWest',
#' 'politicalParty/RepublicanParty')
#' # Get incoming and outgoing properties for these nodes.
#' inLabels <- GetPropertyLabels(dcids, outgoing = FALSE)
#' outLabels <- GetPropertyLabels(dcids)
GetPropertyLabels <- function(dcids, outgoing = TRUE) {
  dcids = ConvertibleToPythonList(dcids)
  return(dc$get_property_labels(dcids, outgoing))
}

#' Return property values along a property for one or more nodes
#'
#' Returns all neighboring nodes of each specified node via the specified
#' property and direction. The neighboring nodes are "values" for the
#' property and can be leaf (primitive) nodes.
#'
#' @param dcids required, vector OR single-column tibble/data frame of
#' string(s) that uniquely identify node(s) to get property values for.
#' @param prop required, string identifying the property to get the property
#' values for.
#' @param outgoing optional, boolean indicating whether the property
#' originates from the given node. TRUE by default.
#' @param valueType optional, string identifying the node type to filter the
#' results by. NULL by default.
#' @param limit optional, integer indicating the maximum number of values to
#' return across all properties. 100 by default.
#' @return Named list or column of values associated with the given dcid(s)
#' via the given property and direction.
#' Returns a named list if the dcids input is a vector of strings,
#' or a new single-column tibble if the dcids input is a tibble/data frame.
#' @export
#' @examples
#' # Set the dcid to be that of Santa Clara County.
#' sccDcid <- 'geoId/06085'
#' # Get the landArea value of Santa Clara (a leaf node).
#' landArea <- GetPropertyValues(sccDcid, 'landArea')
#'
#' # Create a vector with Santa Clara and Miami-Dade County dcids
#' countyDcids <- c('geoId/06085', 'geoId/12086')
#' # Get all Cities contained in these counties.
#' cities <- GetPropertyValues(countyDcids, 'containedInPlace',
#' outgoing = FALSE, valueType = 'City')
#'
#' # Create a data frame with Santa Clara and Miami-Dade County dcids
#' df <- data.frame(countyDcid = c('geoId/06085', 'geoId/12086'))
#' # Get all Cities contained in these counties.
#' df$cityDcid <- GetPropertyValues(select(df, countyDcid), 'containedInPlace',
#' outgoing = FALSE, valueType = 'City')
GetPropertyValues <- function(dcids, prop, outgoing = TRUE, valueType = NULL,
                              limit = 100) {
  dcids = ConvertibleToPythonList(dcids)
  return(dc$get_property_values(dcids, prop, outgoing, valueType, limit))
}
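
Review note: the R wrapper above delegates to the Python client, so the documented semantics of `GetPropertyValues` (direction via `outgoing`, the `valueType` filter, and `limit`) can be sketched in Python against a small in-memory graph. This is only an illustration, not the client's implementation: the node ids, `TRIPLES`, `NODE_TYPES`, and `property_values` are hypothetical names, and applying `limit` per node is an assumption made for the sketch.

```python
# Hypothetical toy graph of (subject, predicate, object) triples.
# None of these ids are real Data Commons dcids.
TRIPLES = [
    ("city/A", "containedInPlace", "county/X"),
    ("city/B", "containedInPlace", "county/X"),
    ("town/C", "containedInPlace", "county/X"),
    ("county/X", "containedInPlace", "state/S"),
]

# Hypothetical node types, used only for the value_type filter.
NODE_TYPES = {
    "city/A": "City",
    "city/B": "City",
    "town/C": "Town",
    "county/X": "County",
    "state/S": "State",
}


def property_values(dcids, prop, outgoing=True, value_type=None, limit=100):
    """Map each dcid to its neighbors along `prop` in the given direction."""
    result = {}
    for dcid in dcids:
        if outgoing:
            # The property originates from the node: collect the objects.
            values = [o for s, p, o in TRIPLES if s == dcid and p == prop]
        else:
            # The property points at the node: collect the subjects.
            values = [s for s, p, o in TRIPLES if o == dcid and p == prop]
        if value_type is not None:
            values = [v for v in values if NODE_TYPES.get(v) == value_type]
        # Assumption for this sketch: the limit is applied per node.
        result[dcid] = values[:limit]
    return result


cities = property_values(["county/X"], "containedInPlace",
                         outgoing=False, value_type="City")
print(cities)
```

With `outgoing=False` the county is the object of `containedInPlace`, so the result lists the places contained in it, which is why the R examples pass `outgoing = FALSE` to find cities.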

#' Return all triples involving specified nodes
#'
#' Returns all triples (subject-predicate-object) where the specified node is
#' either a subject or an object.
#'
#' @param dcids required, vector of string(s) that uniquely identify
#' the node(s) to get triples for.
#' @param limit optional, integer indicating the max number of triples to get
#' for each property. 100 by default.
#' @return Map between each dcid and all triples with the dcid as the subject or
#' object. Triples are represented as (subject, predicate, object).
#' @export
#' @examples
#' # Set the dcid to be that of Santa Clara County.
#' sccDcid <- 'geoId/06085'
#' # Get triples.
#' triples <- GetTriples(sccDcid)
#'
#' # List of dcid strings of Florida, Planned Parenthood West, and the
#' # Republican Party.
#' dcids <- c('geoId/12', 'plannedParenthood-PlannedParenthoodWest',
#' 'politicalParty/RepublicanParty')
#' # Get triples.
#' triples <- GetTriples(dcids)
GetTriples <- function(dcids, limit = 100) {
  dcids = ConvertibleToPythonList(dcids)
  return(dc$get_triples(dcids, limit))
}
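
Review note: the `GetTriples` docstring says a node's triples are those where it appears as subject or object, with `limit` applied per property. A minimal Python sketch of that reading follows. It uses a made-up in-memory graph; `TRIPLES`, the node ids, and `triples_for` are hypothetical, and the way the per-property cap is counted here is an assumption rather than the service's documented behavior.

```python
from collections import defaultdict

# Hypothetical toy graph of (subject, predicate, object) triples.
TRIPLES = [
    ("city/A", "containedInPlace", "county/X"),
    ("city/B", "containedInPlace", "county/X"),
    ("county/X", "containedInPlace", "state/S"),
    ("county/X", "name", "County X"),
]


def triples_for(dcids, limit=100):
    """Map each dcid to the triples in which it is the subject or the object."""
    result = {}
    for dcid in dcids:
        per_property = defaultdict(int)  # triples kept so far, by predicate
        kept = []
        for subject, predicate, obj in TRIPLES:
            if dcid != subject and dcid != obj:
                continue
            # Assumption: keep at most `limit` triples for each predicate.
            if per_property[predicate] < limit:
                kept.append((subject, predicate, obj))
                per_property[predicate] += 1
        result[dcid] = kept
    return result


print(triples_for(["county/X"], limit=1))
```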
46 changes: 46 additions & 0 deletions api-R/R/places.R
@@ -0,0 +1,46 @@
# Data Commons Node API - Places Convenience Function
#
# GetPlacesIn
#
# These functions provide R access to
# Data Commons Node API Places convenience functions.
# These functions are designed to make adding a new column
# to a data frame convenient!
# www.DataCommons.org

#' Return places of a specified type contained in specified places
#'
#' Returns a mapping between the specified places
#' and the places of a specified type contained in them.
#'
#' Assigning output to a tibble/data frame will yield a list of contained
#' places. To convert this to 1-to-1 mapping (the containing place will
#' be repeated), use \code{tidyr::unnest}.
#'
#' @param dcids required, dcid(s) identifying a containing place.
#' This parameter will accept a vector of strings
#' or a single-column tibble/data frame of strings.
#' To select a single column, use \code{select(df, col)}.
#' @param placeType required, string identifying the type of place to query for.
#' @return Named list or column of places contained in each given dcid of the
#' given placeType. If dcids input is vector of strings, will return a named
#' list. If dcids input is tibble/data frame, will return a new single-column
#' tibble/data frame.
#' @export
#' @examples
#' # Atomic vector of the dcids of Santa Clara and Montgomery County.
#' countyDcids <- c('geoId/06085', 'geoId/24031')
#' # Get towns in Santa Clara and Montgomery County.
#' towns <- GetPlacesIn(countyDcids, 'Town')
#'
#' # Tibble of the dcids of Santa Clara and Montgomery County.
#' df <- tibble(countyDcid = c('geoId/06085', 'geoId/24031'))
#' # Get towns in Santa Clara and Montgomery County.
#' df$townDcid <- GetPlacesIn(df, 'Town')
#' # Since GetPlacesIn returned a mapping between counties and
#' # a list of towns, you can use tidyr::unnest to create
#' # a 1-1 mapping between each county and its towns.
GetPlacesIn <- function(dcids, placeType) {
  dcids = ConvertibleToPythonList(dcids)
  return(dc$get_places_in(dcids, placeType))
}
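
Review note: the `GetPlacesIn` docs describe a result that maps each containing place to a list of contained places, and suggest `tidyr::unnest` to turn that into a 1-to-1 mapping with the containing place repeated. The reshaping itself can be sketched in plain Python. The `places` mapping and the `unnest` helper below are hypothetical stand-ins for illustration, not output from the API or part of the client.

```python
def unnest(mapping):
    """Flatten {parent: [children]} into (parent, child) rows.

    The parent is repeated once per child, which mirrors the 1-to-1
    mapping described in the GetPlacesIn documentation.
    """
    return [(parent, child)
            for parent, children in mapping.items()
            for child in children]


# Hypothetical result shape: each containing place maps to a list of places.
places = {
    "county/X": ["town/A", "town/B"],
    "county/Y": ["town/C"],
}

for county, town in unnest(places):
    print(county, town)
```

A containing place with an empty list produces no rows in this sketch, so a caller who needs to keep such places would have to handle them separately.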