-
Notifications
You must be signed in to change notification settings - Fork 250
Checking algorithm parameters for validity #182
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
… testUtil for checking that illegal arguments are throwing IlligalArgumentExecption
Codecov Report
@@ Coverage Diff @@
## master #182 +/- ##
==========================================
+ Coverage 88.14% 88.81% +0.66%
==========================================
Files 17 17
Lines 717 733 +16
Branches 56 56
==========================================
+ Hits 632 651 +19
+ Misses 85 82 -3
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think these are great to have and think perhaps as a follow up we should:
- add documentation of the acceptable values, so they can be more discoverable
- add default values and documentation of those
what do you think?
@@ -45,7 +45,7 @@ class ShortestPaths private[graphframes] (private val graph: GraphFrame) extends | |||
* The list of landmark vertex ids. Shortest paths will be computed to each landmark. | |||
*/ | |||
def landmarks(value: Seq[Any]): this.type = { | |||
// TODO(tjh) do some initial checks here, without running queries. | |||
require(value.nonEmpty, "expecting at least one landmark") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also check for the type within the Seq?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tracing through this implementation, and correct me if I'm wrong, it looks like graphframe vertex IDs can be of any type allowable within a spark Dataframe. I can add a pattern match check for all of those types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
right - so perhaps not a good idea to duplicate the check here, if Spark DataFrame is already checking it?
resetProb = Some(value) | ||
this | ||
} | ||
|
||
def tol(value: Double): this.type = { | ||
require(value >= 0, s"Tolerance must be no less than 0, but got ${value}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd leave this as tol
for consistency with other checks?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
okay
@@ -80,16 +80,20 @@ class PageRank private[graphframes] ( | |||
|
|||
/** Reset probability "alpha" */ | |||
def resetProbability(value: Double): this.type = { | |||
require(value >= 0 && value <= 1, s"Random reset probability must belong" + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd leave this as resetProbability
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
okay
@felixcheung Agreed some documentation and default values would be helpful. I can add most of them. Though, I'm not familiar enough with the gamma parameters of SVDPlusPlus. So, if you or someone else can shed light or point me towards existing documentation for these parameters, that would be much appreciated. Thanks! |
unfortunately there isn't much information on SVDPlusPlus - let's skip that one for now |
@rjurney @SauronShepherd I like this PR. It may be slightly incomplete, but it is still a good starting point. What do you think about it? If you are OK with it, I can try to resolve conflicts and we can merge it. |
It's not a bad idea, but maybe we should stop for a moment and rethink all these config-related issues/new proposals. |
If it helps and makes sense, I like the PR. If we want a more complete approach, we can do that too but let’s not hold up contributions that aren’t sweeping as it precludes most people from contributing. |
Adding some checks for parameters of various algorithms within GraphFrames #76 . TODO: check to see if SVD++ parameter requirements can be made more specific.